Difference between revisions of "Package:Rsync"

Line 12:
 
When using rsync to synchronize files over a network connection, keep in mind that rsync, by default, uses a file's ''modification time and size'' to determine whether the copy at the destination needs to be updated. This matters because, by default, rsync does not preserve modification times: a file written to the destination is stamped with the time of the transfer rather than the source file's modification time. This has important implications for performance when rsync is run again to synchronize the same files.
 
  
Without the {{c|-a}} or {{c|-t}} option specified, rsync will check the file size (which will match) and modification time (which will not), and thus assume the file is different. This will cause rsync to use its delta-transfer algorithm to attempt to update the file over the network. The delta-transfer algorithm has been optimized to minimize network utilization, but it still causes both the local system to load the entire local file from disk, and the remote system to load the entire remote file from disk, in order to calculate checksums. This means that a 50GB file, when synchronized this way, will cause about 50GB of disk IO locally, and about 50GB of disk IO on the remote system. This can be quite slow, especially when transmitting large quantities of data.
+
Without the {{c|-a}} or {{c|-t}} option specified, a later run of rsync will check the file size (which will match) and the modification time (which will not), and thus assume the file is different. This will cause rsync to use its delta-transfer algorithm to attempt to update the file over the network. The delta-transfer algorithm has been optimized to minimize network utilization, but it still requires both the local and the remote system to read the entire file from disk in order to calculate checksums. This means that a 50GB file, when synchronized this way, will cause about 50GB of disk IO locally and about 50GB of disk IO on the remote system. This can slow things down significantly, especially when transmitting large quantities of data, or when one or both systems are already experiencing heavy IO load.
  
 
The solution to this problem is to use the {{c|-t}} option (also enabled as part of {{c|-a}}) to preserve modification times. When you do this, the modification time of the remote file will be set to match that of the local file. Then, on a subsequent rsync invocation, rsync will compare the local and remote size and modification time, find that they both match, and skip the file without invoking the delta-transfer algorithm. Congratulations -- if the file you were rsyncing was 50GB, then you just saved about 100GB of disk IO.
 
 
{{EbuildFooter}}
 

Revision as of 05:21, January 15, 2015

net-misc/rsync


Source Repository: Gentoo Portage Tree

Summary: Rsync is a very fast file copying tool that has been optimized to synchronize files efficiently over a network connection.



Rsync


Rsync Tips

Enable Timestamp Updates

Here's an important tip for maximizing rsync performance over a network connection.

When using rsync to synchronize files over a network connection, keep in mind that rsync, by default, uses a file's modification time and size to determine whether the copy at the destination needs to be updated. This matters because, by default, rsync does not preserve modification times: a file written to the destination is stamped with the time of the transfer rather than the source file's modification time. This has important implications for performance when rsync is run again to synchronize the same files.
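The size-and-time check described above can be modeled in a few lines of Python. This is a simplified sketch, not rsync's actual implementation: the names quick_check_differs and demo are hypothetical, and comparing modification times at whole-second resolution is an assumption made for illustration. The demo uses shutil.copyfile, which copies data only, as a stand-in for a transfer that does not preserve modification times.

```python
import os
import shutil
import tempfile


def quick_check_differs(src_path, dst_path):
    """Return True if the destination looks out of date.

    Simplified model of the default check: a file counts as unchanged
    only when both its size and its modification time match.
    """
    if not os.path.exists(dst_path):
        return True
    src, dst = os.stat(src_path), os.stat(dst_path)
    same_size = src.st_size == dst.st_size
    # Whole-second comparison is an assumption of this sketch.
    same_mtime = int(src.st_mtime) == int(dst.st_mtime)
    return not (same_size and same_mtime)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        with open(src, "wb") as f:
            f.write(b"x" * 1024)
        # Pretend the source was last modified long ago.
        os.utime(src, (1_000_000_000, 1_000_000_000))
        # copyfile copies data only, so dst is stamped with the current time.
        shutil.copyfile(src, dst)
        return quick_check_differs(src, dst)


print(demo())  # True: sizes match, modification times do not
```

Even though the two files are byte-for-byte identical, the check reports them as different, which is the situation that triggers the extra work on the next run.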

Without the -a or -t option specified, a later run of rsync will check the file size (which will match) and the modification time (which will not), and thus assume the file is different. This will cause rsync to use its delta-transfer algorithm to attempt to update the file over the network. The delta-transfer algorithm has been optimized to minimize network utilization, but it still requires both the local and the remote system to read the entire file from disk in order to calculate checksums. This means that a 50GB file, when synchronized this way, will cause about 50GB of disk IO locally and about 50GB of disk IO on the remote system. This can slow things down significantly, especially when transmitting large quantities of data, or when one or both systems are already experiencing heavy IO load.
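To make the arithmetic concrete, here is a rough estimate in Python. It assumes, as described above, that each repeat run without -t reads the whole file once on each side; the function name resync_disk_reads_gb and the runs parameter are hypothetical, introduced only for this illustration.

```python
def resync_disk_reads_gb(file_size_gb, runs):
    """Estimate disk reads caused by re-syncing one unchanged file.

    Assumes each run reads the entire file once locally and once
    remotely to calculate checksums.
    """
    local = file_size_gb * runs
    remote = file_size_gb * runs
    return {"local_gb": local, "remote_gb": remote, "total_gb": local + remote}


print(resync_disk_reads_gb(50, 1))
# {'local_gb': 50, 'remote_gb': 50, 'total_gb': 100}
```

The cost repeats on every run, so under the same assumption a week of nightly runs, resync_disk_reads_gb(50, 7), comes to about 700GB of avoidable disk reads in total.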

The solution to this problem is to use the -t option (also enabled as part of -a) to preserve modification times. When you do this, the modification time of the remote file will be set to match that of the local file. Then, on a subsequent rsync invocation, rsync will compare the local and remote size and modification time, find that they both match, and skip the file without invoking the delta-transfer algorithm. Congratulations -- if the file you were rsyncing was 50GB, then you just saved about 100GB of disk IO.
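The effect of preserving modification times can be imitated locally in Python. This is only an analogy for what -t does, not rsync itself: copy_file, size_and_mtime_match, and demo are hypothetical names, and os.utime stands in for rsync setting the destination's modification time.

```python
import os
import shutil
import tempfile


def copy_file(src_path, dst_path, preserve_mtime):
    """Copy file data; optionally carry over the source's modification time."""
    shutil.copyfile(src_path, dst_path)
    if preserve_mtime:
        st = os.stat(src_path)
        os.utime(dst_path, (st.st_atime, st.st_mtime))


def size_and_mtime_match(src_path, dst_path):
    """True when size and whole-second modification time both match."""
    src, dst = os.stat(src_path), os.stat(dst_path)
    return src.st_size == dst.st_size and int(src.st_mtime) == int(dst.st_mtime)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        plain = os.path.join(tmp, "plain.bin")
        timed = os.path.join(tmp, "timed.bin")
        with open(src, "wb") as f:
            f.write(b"x" * 1024)
        # Pretend the source was last modified long ago.
        os.utime(src, (1_000_000_000, 1_000_000_000))
        copy_file(src, plain, preserve_mtime=False)
        copy_file(src, timed, preserve_mtime=True)
        return size_and_mtime_match(src, plain), size_and_mtime_match(src, timed)


print(demo())  # (False, True)
```

Only the copy whose modification time was carried over passes the size-and-time comparison, so only that copy would be skipped on the next run.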