I want to create a rule for discovery scans on the local file system and email, but the scan takes a very long time (thousands of hours) even when classification is configured. Can anyone tell me why it takes so long and how to configure it correctly?
There are multiple variables that could cause this. However, since you mentioned both file system and email discovery scans, the first thing to check is the CPU and RAM utilization on the system while the scans are running. Does the overall CPU or RAM utilization exceed 50%? If so, the scans may be pausing until utilization drops, which prolongs the time the scan takes to complete. If that is the case, the threshold can be changed from the default of 50%. The settings that pause endpoint scans based on CPU/RAM utilization are in the Windows Client Configuration policy > Discovery (Endpoint).
If you have the default values of 50%, as seen in the attached screenshot, then this looks like expected behavior. The settings that suspend scans based on CPU and memory utilization apply to overall system utilization, not just the DLP processes. Your screenshot shows both CPU and RAM utilization over 50%, so the endpoint scan is paused. The scan will resume once overall CPU and RAM utilization are both below 50%. These values can be changed in the policy if needed.
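To make the pause rule concrete, here is a minimal Python sketch of the behavior described above. It is only an illustration, not the DLP client's actual code: the function names `should_pause` and `paused_fraction` are hypothetical, and it assumes the scan pauses when either value is strictly above its threshold (how the product treats a reading of exactly 50% is not stated here).

```python
def should_pause(cpu_percent, ram_percent, cpu_limit=50.0, ram_limit=50.0):
    """Return True if the scan would pause under the rule described above.

    Assumption: the scan pauses when EITHER overall CPU or overall RAM
    utilization is above its threshold, and runs only when both are below.
    """
    return cpu_percent > cpu_limit or ram_percent > ram_limit


def paused_fraction(samples, cpu_limit=50.0, ram_limit=50.0):
    """Fraction of (cpu, ram) utilization samples during which the scan is paused."""
    samples = list(samples)
    if not samples:
        return 0.0
    paused = sum(should_pause(c, r, cpu_limit, ram_limit) for c, r in samples)
    return paused / len(samples)


# Hypothetical utilization readings (overall CPU %, overall RAM %).
readings = [(35, 62), (55, 48), (20, 40), (70, 80)]
print(paused_fraction(readings))          # 0.75 with the default 50% limits
print(paused_fraction(readings, 85, 85))  # 0.0 with both limits raised to 85%
```

If you log real utilization on the endpoint, feeding those readings into something like this gives a rough idea of how much of the time the scan is paused and how much a higher threshold would help.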
Additionally, schedule DLP discovery tasks for times when other resource-intensive tasks are not running. For example, avoid running a DLP Discovery scan at the same time as an AV scan, as the two together could consume a large amount of resources.
DLP Endpoint Discovery scans do perform some caching. That is, if a file has already been scanned, it should not be scanned again unless something about the file changes. This is not necessarily a content change; it could be a change in file properties, such as the last modified or last accessed timestamp.
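As a rough illustration of why a metadata change alone can trigger a rescan, here is a small Python sketch of a scan cache keyed on file properties. This is not how the DLP client is implemented; the `ScanCache` class and `fingerprint` function are hypothetical, and the choice of size plus last-modified time as the fingerprint is an assumption made for the example.

```python
import os


def fingerprint(path):
    """Build a cheap fingerprint from file properties (assumed: size and mtime)."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


class ScanCache:
    """Remembers the fingerprint each file had when it was last scanned."""

    def __init__(self):
        self._seen = {}

    def needs_scan(self, path):
        # Rescan if the file was never scanned or its properties changed.
        return self._seen.get(path) != fingerprint(path)

    def mark_scanned(self, path):
        self._seen[path] = fingerprint(path)
```

With a cache like this, a tool that touches timestamps on many files (a backup or indexing job, for example) would make every one of them look new to the next scan, even though the content is the same.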