Pohmelfs is a POSIX frontend to the elliptics distributed network, built on top of a DHT design.
You may find more about elliptics at http://www.ioremap.net/projects/elliptics
An example pohmelfs RAID1 configuration can be found at http://www.ioremap.net/node/535

Here I will describe the pohmelfs mount options.
server=addr:port:family
Remote node to connect to (family may be 2 for IPv4 and 6 for IPv6).
You may specify multiple nodes; usually it is enough to list only a subset
of all the remote nodes in the cluster, since pohmelfs will automatically discover the others.
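
For example, a pohmelfs mount can be created from userspace via the mount(2)
syscall; a minimal sketch, assuming a single node listening on 192.168.0.1:1025
over IPv4 and an existing /mnt/pohmelfs mount point:

    #include <stdio.h>
    #include <sys/mount.h>

    int main(void)
    {
            /* "none" as the source argument: pohmelfs takes its servers
             * from the options string, not from a device (assumption). */
            if (mount("none", "/mnt/pohmelfs", "pohmelfs", 0,
                      "server=192.168.0.1:1025:2") < 0) {
                    perror("mount");
                    return 1;
            }
            return 0;
    }

The equivalent mount(8) invocation would be
'mount -t pohmelfs -o server=192.168.0.1:1025:2 none /mnt/pohmelfs'.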

fsid=<string>
Filesystem ID - you may have multiple filesystems in the same elliptics cluster.
This ID may be thought of as a container or namespace identity.
By default it is 'pohmelfs' (without quotes).

sync_timeout=<int>
Timeout in seconds used to synchronize the local cache with the storage.
In particular, all pending writes will be flushed to storage.
If you read a directory which was previously read more than 'sync_timeout' seconds ago,
it will be reread from storage; otherwise it will be read from the local cache.
The same logic _will_ apply to file content; right now, once read, a file will not
be reread again until the cache is dropped.
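
As an illustration of that cache logic (a minimal sketch, not the actual
kernel code):

    #include <stdbool.h>
    #include <time.h>

    /* A directory read more than sync_timeout seconds ago is reread
     * from storage; otherwise it is served from the local cache. */
    static bool needs_reread(time_t last_read, time_t sync_timeout)
    {
            return time(NULL) - last_read > sync_timeout;
    }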

groups=<int>:<int>:...
You may specify the group IDs to store data to.
One may think of a group ID as a replica ID, i.e. if you specify groups=1:2:3,
each write will put data into the groups with IDs 1, 2 and 3.
A read will fetch data from group 1 first, then 2 and 3.
If your replicas are not in sync, a read will fetch elliptics metadata first,
determine which replica has the most recent data, and will try to read from that replica first.
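
A sketch of that read path (illustration only; read_from_group() is a
hypothetical helper standing in for the real transport code):

    #include <errno.h>
    #include <stddef.h>

    int read_from_group(int group, void *buf, size_t size); /* hypothetical */

    /* Try each group in the order determined above (most recent
     * replica first) until one read succeeds. */
    static int read_object(const int *groups, int ngroups, void *buf, size_t size)
    {
            int i, err = -ENODATA;

            for (i = 0; i < ngroups; ++i) {
                    err = read_from_group(groups[i], buf, size);
                    if (!err)
                            return 0;
            }
            return err;
    }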

Specifies whether to use a hash of the full path name as the inode ID (512 bits; sha512 is used).
The provided number limits the number of temporary pages allocated for path traversal, i.e.
the number of paths hashed in parallel.
Something like 5-10 is OK for common cases.
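
As a userspace illustration of that ID derivation (a sketch using OpenSSL's
SHA512; the in-kernel code differs, and the path is an arbitrary example):

    #include <stdio.h>
    #include <string.h>
    #include <openssl/sha.h>

    int main(void)
    {
            const char *path = "/some/dir/file";    /* full path inside the mount */
            unsigned char id[SHA512_DIGEST_LENGTH]; /* 512-bit inode/object ID */
            unsigned int i;

            SHA512((const unsigned char *)path, strlen(path), id);
            for (i = 0; i < sizeof(id); ++i)
                    printf("%02x", id[i]);
            printf("\n");
            return 0;
    }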

Specifies whether to turn remote checksumming on or off.
Having read csums for large files may not be a very good idea, since every read
will force the server to verify the checksum of the whole file, so for multi-gigabyte
files a read of a single page may take a while (until the file is cached).

successful_write_count=<num>
If not specified, a write will be considered successful only if a quorum
(number of groups above / 2 + 1) of writes succeeded. You may alter this number
with this option.
Please note that if a write does not succeed, the error may only be detected in the
return value of the sync() or close() syscall. Also, an unsuccessful write is rescheduled
and all its pages are redirtied to be resent in the future.
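
A worked example of the default: with groups=1:2:3 there are three groups, so
the quorum is 3 / 2 + 1 = 2, i.e. a write succeeds once two replicas have
acknowledged it. As a sketch:

    /* Default quorum from the formula above: ngroups / 2 + 1. */
    static int default_successful_write_count(int ngroups)
    {
            return ngroups / 2 + 1;
    }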

keepalive_idle=<int>
Number of seconds to wait before sending the first TCP keepalive message.

keepalive_cnt=<int>
Number of TCP keepalive messages to send before closing the connection.

keepalive_interval=<int>
Number of seconds between TCP keepalive messages.
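
These three options map onto the standard TCP keepalive socket parameters
(TCP_KEEPIDLE, TCP_KEEPCNT, TCP_KEEPINTVL). A userspace sketch of what, say,
keepalive_idle=10, keepalive_cnt=3 and keepalive_interval=5 (example values
only) correspond to:

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    static int set_keepalive(int fd)
    {
            int on = 1, idle = 10, cnt = 3, intvl = 5; /* example values */

            /* Enable keepalive, then tune the idle time, probe count and
             * probe interval - the knobs the mount options expose. */
            if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) ||
                setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) ||
                setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) ||
                setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)))
                    return -1;
            return 0;
    }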

readdir_allocation=<int>
Number of pages allocated in a single kmalloc() call when reading directory content
from the server. For example, readdir_allocation=4 on a system with 4 KiB pages
requests 16 KiB per allocation.
Please note that higher-order allocations may fail, while low-order ones (like 1 or 2 pages)
end up in slow directory reads for large directories.
It may take up to a couple of seconds to read a directory with several thousand entries,
but that is usually because the VFS calls the ->lookup() method for every directory entry.

Forces flushing of the inode (and its data) to disk when the file is closed.

connection_pool_size=<int>
Number of simultaneous connections to every remote node. Connections are selected
in a round-robin fashion, but 1/4 of them (or at least one) are reserved for small
requests, which usually carry metadata messages like directory listings or file
lookup requests. Mixing them with bulk IO requests is always a bad idea.

read_wait_timeout=<int>/write_wait_timeout=<int>
Maximum number of milliseconds to wait for an appropriate request to complete.
By default both are equal to 5 seconds, which is not always a good idea, especially with
huge readahead, big cache writeback intervals and/or rather slow disks.
These timeouts are used not only for IO requests, but also for metadata commands like
directory listing or object lookup.
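
Putting several of the options together, a complete mount might look like the
sketch below (addresses, mount point and values are assumptions to adapt; the
options are passed as a comma-separated data string):

    #include <stdio.h>
    #include <sys/mount.h>

    int main(void)
    {
            const char *opts = "server=192.168.0.1:1025:2,fsid=test,"
                               "groups=1:2:3,sync_timeout=30,"
                               "read_wait_timeout=10000,write_wait_timeout=30000";

            if (mount("none", "/mnt/pohmelfs", "pohmelfs", 0, opts) < 0) {
                    perror("mount");
                    return 1;
            }
            return 0;
    }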