Flushing
When running evaluations with large datasets, you may experience a long delay before the program exits, while the dataset is still being uploaded in background threads. This generally occurs when main thread execution finishes before background cleanup is complete. Calling client.flush() forces all background tasks to be processed in the main thread, ensuring parallel processing during main thread execution. This can improve performance when user code completes before the data has been uploaded to the server.
Example:
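A minimal sketch of the pattern, assuming only that the object returned by weave.init() exposes flush() as described above. The helper name run_then_flush, the project name "fast-app", and run_my_evaluation are hypothetical and used for illustration only.

```python
from typing import Any, Callable


def run_then_flush(client: Any, work: Callable[[], Any]) -> Any:
    """Run `work`, then flush the client's background tasks before returning.

    `client` is any object with a flush() method, such as the client
    returned by weave.init().
    """
    try:
        return work()
    finally:
        # Process pending background work (e.g. dataset uploads) now,
        # instead of leaving it for interpreter shutdown.
        client.flush()


# Intended usage (requires the weave package and a W&B login; not run here):
#   import weave
#   client = weave.init("fast-app")            # hypothetical project name
#   result = run_then_flush(client, run_my_evaluation)
```

Wrapping the call in try/finally is a design choice of this sketch: the flush still happens if the evaluation raises, so data logged before the failure is uploaded.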
Increasing client parallelism
Client parallelism is automatically determined based on the environment, but can be set manually using the WEAVE_CLIENT_PARALLELISM environment variable, which sets the number of threads available for parallel processing. Increasing this number can improve the performance of background tasks such as dataset uploads.
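As a rough illustration, the variable can be set from Python before the client is created. The value 100 and the project name are arbitrary examples, and the assumption here is that the variable is read when weave.init() runs, so it has to be set first.

```python
import os

# Environment variable values are always strings; 100 is an arbitrary example.
# Assumption: this must be set before weave.init() is called to take effect.
os.environ["WEAVE_CLIENT_PARALLELISM"] = "100"

# Then initialize as usual (requires the weave package; not run here):
#   import weave
#   client = weave.init("fast-app")  # hypothetical project name
```

Setting the variable in the shell or in the deployment environment achieves the same thing without touching the code.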
This can also be set programmatically using the settings argument to weave.init():
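A hedged sketch of the programmatic form. It assumes the settings key is named client_parallelism, mirroring the environment variable; the helper names, the project name, and the value 100 are hypothetical.

```python
from typing import Any, Dict


def parallelism_settings(threads: int) -> Dict[str, int]:
    """Build a settings dict for weave.init() (key name is an assumption)."""
    if threads < 1:
        raise ValueError("threads must be a positive integer")
    return {"client_parallelism": threads}


def init_with_parallelism(project: str, threads: int) -> Any:
    """Initialize Weave with an explicit thread count."""
    import weave  # imported lazily so this module loads without weave installed

    return weave.init(project, settings=parallelism_settings(threads))


# Intended usage (requires the weave package and a W&B login; not run here):
#   client = init_with_parallelism("fast-app", 100)
```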
Troubleshooting