You are replying to:

  1. Just implemented a small-scale version of something similar here. I took the approach of trusting only the data at the server, rather than scoped server memory variables of some flavor (as suggested by others), and used asynchronous client-side calls where each response updates the progress indicator. For now that's just a debug console, but it will later be a progress bar like yours.

    The advantage of handling each document as a single request and trusting only the data on the server is that the process can be aborted at any time and neither progress nor data is lost. If the connection drops, the process simply resumes when the user comes back, starting off by asking the server for the unprocessed data as JSON.

    Also, as it's asynchronous, other work can continue while this is going on. With thousands of documents, though, I can see this approach being much slower than yours. I could clump documents into batches with each call, but if there were a problem with one document in a batch, it would be difficult to suss out where the problem was and where to resume efficiently... though I suppose flagging failed docs and excluding them from the next batch might work too. The other advantage of this approach is that it scales *down*, which is often overlooked when considering batch processes. I try to make logical units operate on single items and scale up to larger management units as needed, rather than start big and try to reverse-engineer the scaling down, as that can cause a lot of rewrites of code and design.

    I agree with you, though: it's horses for courses. Each situation has unique requirements to consider... in this case, robustness was desired more so than speed. Altering this balance would yield a solution similar to yours or Mark's.
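The one-request-per-document loop described above can be sketched like this. It's a minimal JavaScript sketch under stated assumptions: the in-memory `makeServer` stub (with its `unprocessed` and `process` methods) is a hypothetical stand-in for the real server endpoints, not the commenter's actual API; a real version would use `fetch()` against those endpoints.

```javascript
// Hypothetical stand-in for the server. It returns the IDs still
// unprocessed, and marks one document processed per call. In a real
// version these would be fetch() calls returning JSON.
function makeServer(ids) {
  const pending = new Set(ids);
  return {
    async unprocessed() { return [...pending]; },         // e.g. GET  /unprocessed
    async process(id) { pending.delete(id); return id; }, // e.g. POST /process/:id
  };
}

// One request per document: only the server's data is trusted, so the
// loop can be aborted at any time without losing progress. Resuming is
// just re-fetching the unprocessed list and carrying on.
async function processAll(server, onProgress) {
  const ids = await server.unprocessed(); // resume point after a lost connection
  let done = 0;
  for (const id of ids) {
    await server.process(id);
    onProgress(++done, ids.length); // update debug console / progress bar
  }
  return done;
}
```

Because each document is its own awaited request, the UI thread stays free for other work, and a failure pinpoints exactly which document to flag and exclude on the next pass.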
