= FhGFS Changelog (2012.10 Release Series) =

== Changes in 2012.10-r4 ==
* client: Temporarily disabled kernel NFS export and open_by_handle support because of known problems: In some cases, files could become inaccessible after being moved, until the file system was remounted. (Regression from r2.)

== Changes in 2012.10-r3 ==
* client: Fixed internal file mds reference update after file move to another metadata server (regression from r2).
* meta: Fixed error return code when unlink() and close() were racing together on the same file.

== Changes in 2012.10-r2 ==
* client: Updated to be compatible with linux-2.6.16 up to linux-3.9. [Thanks to Taras Shevchenko National University of Kyiv for the preparatory work.]
* client: Improved async buffer flushes (don't wait for locked buffers).
* client: Always use 32bit inode numbers on 32bit systems. [Thanks to Dattatec for reporting.]
* client: Updated comments regarding standard OFED kernel include path for new OFED releases (/usr/src/openib).
* client: Added session check to detect server cache loss with read/write/fsync messages. (Only enabled if all servers run at least version 2012.10-r2.)
* client: New config file option "sysSyncOnClose" to force buffer flush when a file is closed.
* client: New config file option "sysSessionCheckOnClose" to perform an extra session check when a file is closed.
* client: Fixed "CPU stuck for more than n seconds" kernel log message while a storage server was unresponsive.
* client: Added support for open_by_handle syscall and kernel NFS server export (experimental, requires linux-2.6.29 or higher).
* client: Init script updated to detect all currently active fhgfs mounts.
* client: Fixed return code (EFAULT) when an invalid userspace buffer is passed to read()/write() in buffered mode.
* client: Fixed file lookup+create problem on RHEL 5.x kernels. [Thanks to INRIA for reporting.]
* client: Removed "dynamic" inode number style option from config.
* client_opentk: Fixed Infiniband compile with RHEL 5.9 kernel (2.6.18-348). [Thanks to Clustervision for reporting.]
* meta: Fixed a rare case where a warning was generated about a directory not being properly referenced.
* meta: Several minor internal locking fixes/improvements.
* storage: Fixed problem where benchmark files from fhgfs-ctl mode "storagebench" were not correctly removed from an underlying XFS file system.
* storage: Fixed problem where fhgfs-ctl mode "clientstats" sometimes showed wrong values for read/write throughput.
* fhgfs-ctl: Mirror targets now included in search of modes "find" and "migrate".
* fhgfs-ctl: Fixed continue after certain errors in mode "migrate". [Thanks to University of California, Irvine for reporting.]
* fhgfs-ctl: Mode "clientstats" now requires fewer network messages to query statistics. (Only if servers run at least version 2012.10-r2.)
* fhgfs-ctl: Improved mode "getentryinfo" output for mirrored files.
* fsck: Several minor fixes/improvements.
* opentk_lib: Log Infiniband errors to syslog.
* general: Fixed Debian binaries reporting wrong version numbers.

== Changes in 2012.10-r1 ==
This is a major update with new features and improvements in all areas. If you are upgrading from a previous 2011.04 release, make sure to read the compatibility section below.

=== New & Noteworthy ===
* New metadata format with inlined inodes: The new format allows a more efficient internal handling and scales better to large numbers of files. The inode inlining also reduces disk access/seeks and required disk space.
* Changes in metadata handling also include more general optimizations, e.g. internal locking optimizations that allow for higher parallelism, especially for modifying metadata operation types like file creation.
* Intents for metadata operations are now enabled by default.
* New optional metadata mirroring
  * Asynchronous mirroring of metadata can be enabled/disabled on a per-directory basis, so not all directories have to be mirrored.
  * The mirroring setting is inherited by new subdirectories.
  * Mirror servers are chosen individually for each new directory, so there are no passive (i.e. mirror-only) servers in such a file system.
  * See "fhgfs-ctl --mirrormd --help" and http://www.fhgfs.com/wiki/wikka.php?wakka=AboutMirroring for more information.
* New optional file contents mirroring
  * Synchronous mirroring of file contents can be enabled/disabled on a per-directory basis, so not all files have to be mirrored.
  * The mirroring setting is inherited by new subdirectories.
  * Mirror targets are chosen individually for each new file, so there are no passive (i.e. mirror-only) targets in such a file system.
  * Mirroring can also be used with an odd number of targets.
  * See "fhgfs-ctl --setpattern --raid10 --help" and http://www.fhgfs.com/wiki/wikka.php?wakka=AboutMirroring for more information.
* New numeric 16-bit server and target IDs: The new release internally uses and stores numeric 16-bit IDs (range 1..65535) instead of the previous string IDs from the 2011.04 series.
  * The new IDs significantly reduce metadata size.
  * The former string IDs still exist to provide human-readable names for servers and targets (e.g. in log messages).
  * As before, all IDs can be assigned automatically or manually.
  * Automatically assigned IDs are randomly chosen by fhgfs-mgmtd when a new server/target is registered.
  * As with previous string IDs, numeric IDs can be set manually by creating a file named "targetNumID" or "nodeNumID" in a storage directory.
    See here for more information on manual assignment: http://www.fhgfs.com/wiki/wikka.php?wakka=FAQ#force_nodeid
* Redesigned file system check: The new fhgfs-fsck tool significantly reduces server-load and runtime for a full check by walking the file system only once and doing the analysis steps afterwards on the client which ran the tool.
  * Currently, clients may not write/modify the file system while the check is running. Online check/repair will be available in an upcoming minor release.
* New fhgfs-ondemand script (/opt/fhgfs/sbin) allows very simple creation and destruction of a new FhGFS instance, e.g. on a per-job basis for clusters
  * The script takes a nodefile as argument and is comparable to running an MPI job.
  * On a per-job basis for clusters, the script would usually be called by the batch system.
* New built-in benchmarking for disk throughput and network throughput:
  * The client network benchmark mode sends data over the network, but does not perform disk reads/writes on storage servers.
  * Network benchmarking is activated per-client by using: "echo 1 > /proc/fs/fhgfs//netbench_mode"
  * The storage targets benchmark mode measures server-side throughput of storage targets without any network data transfer.
  * See "fhgfs-ctl --storagebench --help" for more information.
* New storage server config options were added to control explicit read-ahead and write flushing for sequential IO
  * See tuneFileReadAhead... and tuneFileWriteSyncSize in fhgfs-storage.conf.
* Client logging can now optionally be done via syslog instead of via the fhgfs-helperd daemon.
  * See option "logType" in fhgfs-client.conf.
* Client log level can now be changed at runtime
  * See /proc/fs/fhgfs//log_levels.
* New client/server config option allows setting of Infiniband type-of-service to control quality-of-service (and other options) via opensm config
  * See config file option "connRDMATypeOfService".
* New server config option allows binding of worker threads to NUMA areas
  * See option "tuneWorkerNumaAffinity" in server config files.
* Command line parameters for the fhgfs-ctl tool are now specified in the more commonly known "--param" format with dashes.

=== Compatibility Notes ===
* The on-disk storage format has changed with this release series. Upgrading from a previous 2011.04 FhGFS version requires a data conversion. See here for upgrade instructions: http://wiki.fhgfs.com/wikka.php?wakka=ServerUpgrade201210
* Supported distributions: RedHat 5/6 (and Fedora), SLES 10/11 (and OpenSuse), Debian 5/6 (and Ubuntu)
* Supported Linux kernel versions (client): 2.6.16 - 3.5
* Servers and clients from different major release series are not compatible and cannot be used together in the same file system instance.
* Servers and clients with different minor versions of the 2012.10 release series are compatible and can be used together in the same file system instance.
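
The two compatibility rules above (same major release series: compatible; different series: not compatible) can be sketched as a small check. This is only an illustration and not part of FhGFS: the helper names parse_version and is_compatible are hypothetical, and the version string format "<series>-r<N>" is an assumption inferred from the release headings in this changelog. The notes state minor-version compatibility only for the 2012.10 series; applying the same rule to other series is also an assumption.

```python
import re

# Assumed format, inferred from headings such as "2012.10-r4":
# a release series "<year>.<month>" followed by "-r<minor>".
_VERSION_RE = re.compile(r"^(\d{4}\.\d{2})-r(\d+)$")


def parse_version(version):
    """Split a version string such as '2012.10-r4' into ('2012.10', 4)."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return match.group(1), int(match.group(2))


def is_compatible(version_a, version_b):
    """Return True if both versions belong to the same major release series.

    Minor release numbers (the -rN suffix) are ignored, following the
    compatibility notes for the 2012.10 series.
    """
    series_a, _ = parse_version(version_a)
    series_b, _ = parse_version(version_b)
    return series_a == series_b


if __name__ == "__main__":
    print(is_compatible("2012.10-r1", "2012.10-r4"))
    print(is_compatible("2011.04-r1", "2012.10-r4"))
```

A deployment script could run such a check before adding a node to an existing file system instance, rejecting mixed-series setups early instead of failing at connect time.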