mirror of https://github.com/abraunegg/onedrive, synced 2026-03-14 14:35:46 +01:00
Update client-architecture.md
* Update formatting
parent bee494a163
commit 1341396da7
1 changed files with 26 additions and 26 deletions
@@ -126,36 +126,36 @@ This section explains what’s happening under the hood, why it can take time on

### Application Processing Steps: Where the time goes (phase-by-phase)
The flow diagrams above show the main application decision points. The log lines below correspond to the key phases you’ll see during a typical run (standalone or monitor).

1. Fetch current changes from Microsoft Graph
   * Application Output: `Fetching items from the OneDrive API for Drive ID: …`
   * What happens: The client requests change bundles (≈200 items per bundle) using your current delta token. If the token is invalid or it’s a first run, it performs a broader enumeration.
   * Why it can be slow: High item counts, network latency, or Microsoft Graph API throttling.
2. Process each bundle of changes
   * Application Output: `Processing N applicable changes and items received from Microsoft OneDrive`
   * What happens: For each item, we classify (new/changed/deleted/excluded), reconcile with the local database, and queue any work (download, upload, delete, rename).
   * Why it can be slow: Many directories and small files increase metadata churn; each bundle must be applied in order. Bundle size is fixed by Graph.
3. Database integrity checks
   * Application Output: `Performing a database consistency and integrity check on locally stored data`
   * What happens: The application verifies local metadata invariants so subsequent actions don’t corrupt state. This is quick on SSDs but can be noticeable on slow disks.
4. Local filesystem scan
   * Application Output: `Scanning the local file system '…' for new data to upload`
   * What happens: The application walks your configured sync root, applying client-side filtering rules and discovering local items to upload.
   * Why it can be slow: Deep folder trees; slow or network filesystems; complex local state in which many exclusion and inclusion rules must be evaluated to determine what needs to be uploaded.
5. Final reconciliation & actions
   * Application Output: `Number of items to download from Microsoft OneDrive: X`
   * What happens: The application executes the final action queues. On a healthy delta run this step is short; on a first run or after `--resync` it can be significant.
   * Why it can be slow: Many small files; constrained bandwidth; server-side throttling.
1. **Fetch current changes from Microsoft Graph**
   * **Application Output:** `Fetching items from the OneDrive API for Drive ID: …`
   * **What happens:** The client requests change bundles (≈200 items per bundle) using your current delta token. If the token is invalid or it’s a first run, it performs a broader enumeration.
   * **Why it can be slow:** High item counts, network latency, or Microsoft Graph API throttling.
2. **Process each bundle of changes**
   * **Application Output:** `Processing N applicable changes and items received from Microsoft OneDrive`
   * **What happens:** For each item, we classify (new/changed/deleted/excluded), reconcile with the local database, and queue any work (download, upload, delete, rename).
   * **Why it can be slow:** Many directories and small files increase metadata churn; each bundle must be applied in order. Bundle size is fixed by Graph.
3. **Database integrity checks**
   * **Application Output:** `Performing a database consistency and integrity check on locally stored data`
   * **What happens:** The application verifies local metadata invariants so subsequent actions don’t corrupt state. This is quick on SSDs but can be noticeable on slow disks.
4. **Local filesystem scan**
   * **Application Output:** `Scanning the local file system '…' for new data to upload`
   * **What happens:** The application walks your configured sync root, applying client-side filtering rules and discovering local items to upload.
   * **Why it can be slow:** Deep folder trees; slow or network filesystems; complex local state in which many exclusion and inclusion rules must be evaluated to determine what needs to be uploaded.
5. **Final reconciliation & actions**
   * **Application Output:** `Number of items to download from Microsoft OneDrive: X`
   * **What happens:** The application executes the final action queues. On a healthy delta run this step is short; on a first run or after `--resync` it can be significant.
   * **Why it can be slow:** Many small files; constrained bandwidth; server-side throttling.
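
The fetch phase in step 1 can be pictured as a paging loop. The following is a minimal Python sketch, not the client's actual code: `collect_changes`, `fetch_page`, and `FAKE_PAGES` are hypothetical names introduced for illustration, and the fake pages stand in for real network calls. The only parts taken from Microsoft Graph are the response fields `value`, `@odata.nextLink`, and `@odata.deltaLink`.

```python
from typing import Callable, Dict, List, Optional, Tuple

def collect_changes(
    fetch_page: Callable[[str], Dict],  # hypothetical: returns one decoded JSON page
    start_url: str,
) -> Tuple[List[Dict], Optional[str]]:
    """Follow nextLink pages until a deltaLink appears; return (items, deltaLink)."""
    items: List[Dict] = []
    delta_link: Optional[str] = None
    url: Optional[str] = start_url
    while url is not None:
        page = fetch_page(url)
        # Each page carries one bundle of changed items.
        items.extend(page.get("value", []))
        # The final page carries the link (token) to use on the next run.
        delta_link = page.get("@odata.deltaLink", delta_link)
        # Intermediate pages point at the next bundle; the last page has no nextLink.
        url = page.get("@odata.nextLink")
    return items, delta_link

# Illustrative stand-in for the network: two pages keyed by made-up URLs.
FAKE_PAGES = {
    "page1": {"value": [{"id": "A"}, {"id": "B"}], "@odata.nextLink": "page2"},
    "page2": {"value": [{"id": "C", "deleted": {}}], "@odata.deltaLink": "token-next"},
}

changes, next_token = collect_changes(FAKE_PAGES.__getitem__, "page1")
```

Because each page must be requested before the next link is known, the loop is inherently sequential, which is one way to see why latency and throttling dominate this phase.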

### Why a --resync is slower (by design)
A `--resync` discards the known-good delta token and forces a full online + local walk to re-learn state. This is essential after certain errors or configuration changes, but using it routinely will always cost more time than an incremental run. After the first successful scan, subsequent syncs drop from hours to minutes because the delta token narrows the change set dramatically.

### What affects performance the most
* Item count & structure: Many folders and small files dominate metadata work.
* Network quality: Latency and throughput directly affect how quickly we can iterate Graph pages and transfer content.
* Local Disk & filesystem: SSDs perform metadata and DB work far faster than spinning disks or remote mounts. Your filesystem type (e.g., ext4, XFS, ZFS) matters and should be tuned appropriately.
* File Indexing: Disable file indexing (Tracker, Baloo, Searchmonkey, Pinot and others), as these services add latency and disk I/O to your operations and slow down performance.
* CPU & memory: Classification and hashing are CPU-bound; insufficient RAM or swap can slow DB and traversal work.
* First run vs incremental: First runs / `--resync` must enumerate everything; incremental runs use the delta token and are much faster.
* **Item count & structure:** Many folders and small files dominate metadata work.
* **Network quality:** Latency and throughput directly affect how quickly we can iterate Graph pages and transfer content.
* **Local Disk & filesystem:** SSDs perform metadata and DB work far faster than spinning disks or remote mounts. Your filesystem type (e.g., ext4, XFS, ZFS) matters and should be tuned appropriately.
* **File Indexing:** Disable file indexing (Tracker, Baloo, Searchmonkey, Pinot and others), as these services add latency and disk I/O to your operations and slow down performance.
* **CPU & memory:** Classification and hashing are CPU-bound; insufficient RAM or swap can slow DB and traversal work.
* **First run vs incremental:** First runs / `--resync` must enumerate everything; incremental runs use the delta token and are much faster.
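
To get a rough feel for the "item count & structure" factor, one could count the directories and files under a sync root before a first run. The sketch below is only an illustration in Python: `count_tree` is a hypothetical helper, not part of the client, and it applies none of the client's filtering rules.

```python
import os
from typing import Tuple

def count_tree(root: str) -> Tuple[int, int]:
    """Return (directory_count, file_count) below root, excluding root itself."""
    dirs = 0
    files = 0
    # os.walk visits every directory once and lists its immediate children.
    for _dirpath, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        files += len(filenames)
    return dirs, files
```

Calling `count_tree` on your own sync directory gives two numbers to compare between machines or over time; a tree with many directories and small files implies more metadata work per byte transferred.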

### Practical ways to improve throughput
1. Avoid unnecessary `--resync`. Only use it when the client explicitly advises you to. It forces a full scan.

@@ -178,7 +178,7 @@ The sync engine will generate a simulated delta whenever any of the conditions b

3. The use of `--download-only --cleanup-local-files`. In this mode, consuming raw `/delta` can replay online delete/replace churn in a way that causes valid local files to be deleted (e.g., user deletes a folder online, then recreates it via the web). The simulated delta captures the present online state and intentionally ignores those intermediate delete/replace events, so local “keep” semantics are preserved.
4. The use of 'Shared Folders'. Calling `/delta` on a shared folder path often targets the owner’s entire drive, not just the shared subtree you see. With `sync_list`, this mismatch can mean nothing appears to match (paths are rooted from the owner’s drive, not your shared mount point). The client therefore walks the shared folder itself, normalises paths, and constructs a simulated delta that reflects exactly what’s shared with you.

Why this is slower: A simulated delta requires walking the online tree, which is substantial work for large or deeply nested shares. The trade-off is deliberate: safety and correctness over speed.
**Why this is slower:** A simulated delta requires walking the online tree, which is substantial work for large or deeply nested shares. The trade-off is deliberate: safety and correctness over speed.
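
The idea of a simulated delta can be sketched as a breadth-first walk that records the present state of a folder tree, with paths rooted at the shared folder rather than the owner's drive. This Python sketch rests on assumptions: `simulate_delta`, `list_children`, the `is_folder` key, and the `TREE` fixture are all hypothetical and do not mirror the client's internals or the Graph response shape.

```python
from collections import deque
from typing import Callable, Dict, List

def simulate_delta(
    list_children: Callable[[str], List[Dict]],  # hypothetical: children of one folder
    root_id: str,
    root_name: str,
) -> List[Dict]:
    """Walk the tree below root_id and return a snapshot of what exists right now."""
    snapshot: List[Dict] = []
    queue = deque([(root_id, root_name)])
    while queue:
        folder_id, folder_path = queue.popleft()
        # One listing call per folder: this is where the extra time goes.
        for child in list_children(folder_id):
            # Paths are built relative to the shared folder, not the owner's drive root.
            child_path = f"{folder_path}/{child['name']}"
            snapshot.append({"id": child["id"], "path": child_path})
            if child.get("is_folder"):
                queue.append((child["id"], child_path))
    return snapshot

# Illustrative in-memory tree standing in for online listing calls.
TREE = {
    "root": [
        {"id": "d1", "name": "Docs", "is_folder": True},
        {"id": "f1", "name": "a.txt", "is_folder": False},
    ],
    "d1": [{"id": "f2", "name": "b.txt", "is_folder": False}],
}

snapshot = simulate_delta(lambda item_id: TREE.get(item_id, []), "root", "Shared")
```

The snapshot contains only items that exist at walk time, so intermediate delete and recreate events never appear in it; the cost is one listing request per folder instead of a single change feed.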
## File conflict handling - default operational modes