Frequently Asked Questions (FAQ)


General Questions

  1. Who should use BeeGFS?
  2. Is BeeGFS suitable for production use?
  3. What are the minimum system requirements to run BeeGFS?
  4. Which network types are supported by BeeGFS?
  5. Can I test-drive BeeGFS for free?
  6. Do I need a SAN or any kind of shared storage to run BeeGFS?
  7. Does BeeGFS provide high-availability?
  8. Can I do everything that can be done with the Admon GUI from the command-line?
  9. Is BeeGFS open-source?
  10. What does the abbreviation FhGFS mean?
  11. How does BeeGFS distribute metadata?
  12. How can I back up metadata?


Installation & Startup

  1. Where can I find information about why an Admon-based installation of BeeGFS fails?
  2. We are not using one of the distributions you are providing packages for, which means we cannot install RPM or DEB packages. What can we do to try BeeGFS?
  3. Do I need to be 'root' to run BeeGFS?
  4. Is it possible to run storage servers and clients on the same machine?
  5. Is it possible to run multiple instances of the same daemon on the same machine?
  6. Do I need a special Linux kernel for BeeGFS?
  7. Does BeeGFS need time synchronization between different machines?
  8. How do I upgrade from one BeeGFS major release series to another?
  9. Why do I get an 'Access denied' error on the client even with correct permissions?
  10. How much disk-space is required for metadata?


Configuration & Tuning

  1. Where can I find system tuning recommendations?
  2. How can I remove a node from an existing file system?
  3. Is it possible to free up a storage target before removal?
  4. I did some testing and want to start over again. Is there an easy way to delete all residues from the old file system?
  5. My hosts have more than one network interface. Is there a way of configuring BeeGFS to use only certain interfaces or certain networks?
  6. Is it possible to force a certain node ID or target ID for BeeGFS servers?
  7. What needs to be done when a server hostname has changed?
  8. My client refuses to mount because of an 'unknown storage target'
  9. How can I solve problems with the .Xauthority file for home dirs?




[General Questions]



Who should use BeeGFS?


Everyone with an interest in fast, flexible, and easy-to-manage storage. This is typically the case for HPC clusters, but BeeGFS was designed to also work well on smaller deployments like a group of workstations or even heterogeneous systems with different hardware architectures.


Is BeeGFS suitable for production use?


Yes, absolutely! BeeGFS is not a proof-of-concept or research implementation. BeeGFS was designed for production use since the beginning of its development and is fully supported by Fraunhofer and our international partners.


What are the minimum system requirements to run BeeGFS?


Currently, native BeeGFS client and server components are available for Linux on x86, x86_64 and PowerPC/Cell architectures. In general, all BeeGFS components can run on a single machine with only a single CPU core, a single hard disk and less than 1GB of RAM. But this is probably not what you want to do in practice, so here are some recommendations for your hardware configuration:







Which network types are supported by BeeGFS?


BeeGFS supports all TCP/IP based networks and the native Infiniband protocol (based on OFED ibverbs). Servers and clients can handle requests from different networks at the same time (e.g. your servers can be equipped with Infiniband and Ethernet interconnects and some clients connect via native Infiniband while the rest connects via TCP/Ethernet). Clients with multiple connection paths (like Infiniband and Ethernet or multiple Ethernet ports) can also do network failover if the primary connection path fails.


Can I test-drive BeeGFS for free?


Yes, BeeGFS can be downloaded and used free of charge without any limitations.


Do I need a SAN or any kind of shared storage to run BeeGFS?


No, BeeGFS is not a storage area network (SAN) file system and it does not rely on shared access to storage. Typical hardware configurations for BeeGFS consist of multiple servers with internal (or external non-shared) RAID storage.


Does BeeGFS provide high-availability?


Starting with the 2012.10 release series, BeeGFS provides optional metadata and file contents redundancy (replication). In the 2015.03 release series this concept was extended with additional high availability features. Please see here for more information (or here if you still use a release prior to 2015.03).

Besides that, you are also free to implement cold failover with active/active server pairs, based on shared storage and external tools like Heartbeat or Pacemaker. Although BeeGFS uses write caching on the servers, this approach will allow access to your existing data in case of a machine failure.
To set up active/active pairs with failover based on external tools, you will need to run a new instance of the failed daemon on another machine, so that two instances of the same daemon run on one machine (using different network ports) while the failed machine is down. To make the failover daemon instance appear like the original daemon, you will also need to move IP addresses to the failover machine and make sure the failover instance uses the same NodeID on the failover host. (See the section "Configuration & Tuning" on this page for information on how to manually define a NodeID instead of using the hostname as NodeID.)
Note that the customer wiki contains more details on this topic.


Can I do everything that can be done with the Admon GUI from the command-line?


Yes, installation, configuration and status queries can also be done via the command line.
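For example, status queries similar to those in the GUI can be made with the beegfs-ctl tool from the beegfs-utils package (a sketch; see the answers further down this page for more modes and for installation-related commands):

```
$ beegfs-ctl --listnodes --nodetype=storage
$ beegfs-ctl --listtargets --longnodes
```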


Is BeeGFS open-source?


Currently, only the client source code is available under the GPL.

It was publicly announced at the International Supercomputing Conference 2013 in Leipzig, and it is guaranteed as part of the DEEP-ER project, that the server components of BeeGFS will also be made available under an open-source license.
Until then, we make agreements with some supported customers to share parts of the source code on a per-project basis, and in special cases we can also provide the full source code to customers for whom it might be business-critical.

In any case, you can download and use all components of BeeGFS free of charge.


What does the abbreviation FhGFS mean?


FhGFS is the old name of BeeGFS®, which is now named after the nice and busy animals that work together as a swarm and do their important job mostly unnoticed in the background.

FhG: The German name of the Fraunhofer Society is Fraunhofer Gesellschaft, its official abbreviation is FhG.
FS: FS is short for File System.

Note: Since the meaning of the letter G in FhGFS is not always obvious, some people started calling FhGFS the Fraunhofer Global File System or Fraunhofer GFS or sometimes even just Fraunhofer.


How does BeeGFS distribute metadata?


BeeGFS distributes metadata on a per-directory basis. This means each directory in your filesystem can be managed by a different metadata server. When a new directory is created, the system automatically picks one of the available metadata servers to be responsible for the files inside this directory. New subdirectories of this directory can be assigned to other metadata servers, so that load is balanced across all available metadata servers.


How can I back up metadata?


BeeGFS metadata is stored in extended attributes, which many backup tools do not copy by default. Here are three different ways to back up metadata including its extended attributes. However, these are just examples: other tools like rsync also have options to preserve extended attributes and hard links and could thus also be used to back up BeeGFS metadata.

To reduce the overall runtime, you might want to combine your backup with tools like "xargs" (see parameter "--max-procs") or "GNU parallel" to run multiple processes in parallel, each on a subset of the directory structure.
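As a minimal sketch of such a backup with GNU tar, which can preserve extended attributes: the directories below are throwaway stand-ins created on the fly, so the commands are runnable anywhere; in practice you would point tar at the real metadata directory (the storeMetaDirectory from beegfs-meta.conf) while the metadata server is stopped.

```shell
# Stand-in for the real BeeGFS metadata directory:
META_DIR=$(mktemp -d)
echo demo > "$META_DIR/dentry"   # stand-in for a metadata file

# Create the backup; --xattrs preserves extended attributes,
# and tar preserves hard links by default.
BACKUP="$META_DIR.tar.gz"
tar czf "$BACKUP" --xattrs --xattrs-include='*' \
    -C "$(dirname "$META_DIR")" "$(basename "$META_DIR")"

# Restore elsewhere, again preserving extended attributes:
RESTORE_DIR=$(mktemp -d)
tar xzf "$BACKUP" --xattrs --xattrs-include='*' -C "$RESTORE_DIR"
```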




Additional Notes:





[Installation and Startup]



Where can I find information about why an Admon-based installation of BeeGFS fails?


If you tried to install BeeGFS by using the Administration and Monitoring GUI, there are two log facilities that can provide useful information:


We are not using one of the distributions you are providing packages for, which means we cannot install RPM or DEB packages. What can we do to try BeeGFS?


You can try to download the RPM packages for SUSE or Red Hat and unpack them with rpm2cpio and cpio (rpm2cpio <packagename> | cpio -i). Afterwards, you might need to adapt the init scripts to your distribution. We cannot guarantee that BeeGFS will work with your distribution, but it is worth a try.


Do I need to be 'root' to run BeeGFS?


Yes and no. You do not need root access to start the servers, as these are all userspace processes (of course, you need to change the configuration files so that file system data and log files are stored in places where you have write access).
The client is a kernel module. To load the module and mount the file system, you normally need root privileges. As an alternative, it is also possible to grant non-root users permission to execute the corresponding commands via /etc/sudoers.
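If you want to allow non-root users to load the module and mount the file system, a sudoers entry might look like the following sketch (the group name and mount point are assumptions; adjust them to your site and edit the file with visudo):

```
# Allow members of group "beegfs" to load the client module and
# mount/unmount the file system without a password:
%beegfs ALL=(root) NOPASSWD: /sbin/modprobe beegfs, /bin/mount /mnt/beegfs, /bin/umount /mnt/beegfs
```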


Is it possible to run storage servers and clients on the same machine?


Yes, it is. You do not need dedicated hosts for any service in BeeGFS. For example, it is possible to have one host running a management daemon, a metadata server, a storage server and a client at the same time.


Is it possible to run multiple instances of the same daemon on the same machine?


Yes, it is. Starting with the 2012.10 release series, the standard BeeGFS service init scripts (/etc/init.d/beegfs-XY) can manage multiple instances of the same daemon on a machine. To enable support for this, see the comment on MULTI_MODE in /etc/default/beegfs-XY.

For multi-mode, you will need to create a separate configuration file for the other daemon instance, using different network ports, a different storage directory, a different log file and so on. If the second daemon instance on a machine should become part of the same file system instance (i.e. it registers at the same management daemon as the first daemon instance on this machine), then you would also need to set a different NodeID manually for the second daemon instance. (See "Configuration & Tuning" section on this page for information on how to manually set the NodeID.)

For the client, multi-mode is also available, but the client mount config file (/etc/beegfs/beegfs-mounts.conf) also allows the specification of multiple mount points without multi-mode enabled, by adding one or more additional lines, each with a mount point and the corresponding client config file. (Note: Make sure to specify different client ports in the different client config files.)
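As a sketch, a second storage daemon instance in multi-mode could be configured like this (the file names and values are examples, not defaults; the relevant options are documented in the config files themselves):

```
# /etc/default/beegfs-storage
MULTI_MODE=true

# /etc/beegfs/beegfs-storage2.conf (a copy of beegfs-storage.conf, adjusted):
connStoragePortTCP    = 8103                        # differs from the first instance
connStoragePortUDP    = 8103
storeStorageDirectory = /mnt/myraid2/beegfs_storage # separate storage directory
logStdFile            = /var/log/beegfs-storage2.log
```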

The BeeGFS-on-demand script (/opt/beegfs/sbin/beegfs-ondemand-v2) from the beegfs-utils package is another possible way to run a separate BeeGFS instance, especially on a per-job basis for clusters or in cloud environments.


Do I need a special Linux kernel for BeeGFS?


BeeGFS client modules require at least kernel version 2.6.18, but apart from that, BeeGFS does not need a special kernel: The client kernel modules were designed patchless (so you don't need to apply any kernel patches and don't even need to reboot to install and run the BeeGFS client) and the server components of BeeGFS run as userspace daemons, so they are independent of the kernel version.


Does BeeGFS need time synchronization between different machines?


Yes, the time of all BeeGFS client and server machines needs to be synchronized for various reasons, e.g. to provide consistent file modification timestamps and for consistent ID generation. Make sure all server clocks are set to the correct time and date (e.g. with date or ntpdate) before starting up BeeGFS services.
A service like ntp can then be used to keep the clocks of all machines in sync.
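For example (a sketch; <your-ntp-server> is a placeholder for a reachable NTP server):

```
$ date                       # verify the current time and date on each machine
$ ntpdate <your-ntp-server>  # one-time correction before starting BeeGFS services
$ /etc/init.d/ntpd start     # keep the clocks in sync from now on
```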


How do I upgrade from one BeeGFS major release series to another?


For an upgrade from version 2009.08 to version 2011.04, see here for instructions.
For an upgrade from version 2011.04 to version 2012.10, see here for instructions.
For an upgrade from version 2012.10 to version 2014.01, see here for instructions.


Why do I get an 'Access denied' error on the client even with correct permissions?


Please check if you have SELinux enabled on the client machine. If it is enabled, disabling it should solve your problem. SELinux can be disabled by setting SELINUX=disabled in the config file /etc/selinux/config. Afterwards, you might need to reboot your client for the new setting to become effective.


How much disk-space is required for metadata?


This depends on things like the average file size that you have or how many files you want to be able to create in total.

In general, we recommend reserving about 0.5% to 1% of the total storage capacity for metadata. However, this number is based on statistics gathered from different scientific cluster scratch file systems and thus might or might not fit your case.

More specifically, for every file that a user creates, one metadata file is created on one of the metadata servers. For every directory that a user creates, two directories and two metadata files are created on one of the metadata servers.

For each directory, one inode and one block (usually 4KB) of disk space are used on the underlying local file system until the directory contains so many subentries that the underlying file system needs to allocate another block. How many entries fit into one block depends on factors like file name length, but usually 10 or more entries fit into one directory block.
So if a user creates e.g. many directories with only one file inside them, this will significantly increase the number of used inodes and the disk space on the underlying local file system.

For file metadata, if the underlying local file system of the metadata server (e.g. ext4) is formatted according to our recommendations with large inodes (e.g. "mkfs.ext4 -I 512", as described here: Metadata Server Tuning), then the BeeGFS metadata, stored as an extended attribute, fits completely into the inode of the underlying local file system and does not use any additional disk space. If the underlying file system is not formatted with large inodes, then it will need to allocate a full block (usually 4KB) to store the BeeGFS metadata, in addition to using up one inode.
Note that xfs has the advantage of using a dynamic number of inodes: new inodes can be created as long as there is free disk space. ext4, on the other hand, has a static number of inodes that is defined when the file system is formatted via mkfs (e.g. "mkfs.ext4 -i <number>"). So with ext4 it can happen that you have disk space left but run out of available inodes, or vice versa.
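The rule of thumb above can be turned into a quick capacity estimate. The sketch below only encodes the 0.5% to 1% guideline from this answer; it is an illustration, not a BeeGFS formula:

```python
TB = 1024 ** 4  # bytes per tebibyte

def metadata_capacity_estimate(total_storage_bytes):
    """Return a (low, high) metadata capacity estimate in bytes,
    based on the 0.5% to 1% rule of thumb."""
    return (int(total_storage_bytes * 0.005), int(total_storage_bytes * 0.01))

low, high = metadata_capacity_estimate(500 * TB)
print(low / TB, high / TB)  # a 500 TB file system -> roughly 2.5 to 5 TB of metadata
```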





[Configuration and Tuning]



Where can I find system tuning recommendations?


There are a lot of tuning possibilities to increase performance - not only in BeeGFS, but also in the Linux kernel, the underlying local file systems, the hardware drivers etc. Please have a look at the Tuning Guide for tips and recommendations on system tuning.


How can I remove a node from an existing file system?


Use the BeeGFS Control Tool beegfs-ctl (contained in the beegfs-utils package) if you need to unregister a server from the file system.

Note: If you want to move files from a storage server to the other storage servers before removing, see here: "Is it possible to free up a storage target before removal?"
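A sketch of what unregistering a storage server could look like (the exact mode name and arguments may differ between release series, so check "beegfs-ctl --help" first):

```
$ beegfs-ctl --removenode --nodetype=storage <nodeID>
```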


Is it possible to free up a storage target before removal?


The beegfs-ctl tool has a mode called "migrate", which allows you to move all files from a certain storage target to other storage targets.

Note: Migration is directory-based and currently single-threaded, so a single migration instance may perform well below the capabilities of the hardware. It is possible to start multiple non-interfering instances of "beegfs-ctl --migrate" on the same client (or different clients) for different directory trees, e.g. one instance for /mnt/beegfs/subdir1 and another instance for /mnt/beegfs/subdir2.
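A sketch of such a migration run (the target ID and paths are placeholders; check "beegfs-ctl --help" for the exact syntax of your release):

```
$ beegfs-ctl --migrate --targetid=<targetID> /mnt/beegfs/subdir1
$ beegfs-ctl --migrate --targetid=<targetID> /mnt/beegfs/subdir2   # e.g. a second instance on another client
```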


I did some testing and want to start over again. Is there an easy way to delete all residues from the old file system?


To revert your installation to a completely clean file system, you can follow these steps:
  1. Stop all clients and servers (via the Admon GUI or via /etc/init.d/beegfs-X stop)
  2. Delete the data directories of the metadata servers, storage servers and the management daemon (these are the directories named "store...Directory" in the corresponding /etc/beegfs/beegfs-X.conf config files)
  3. Start all servers and clients again


My hosts have more than one network interface. Is there a way of configuring BeeGFS to use only certain interfaces or certain networks?


Yes, there are two different settings that can be used to achieve this:


Is it possible to force a certain node ID or target ID for BeeGFS servers?


First of all, note that BeeGFS uses two different kinds of IDs for a server node:

  • A string-based node ID. By default, the hostname is used for this.
  • A numeric ID. By default, this ID is randomly generated by the management daemon. Numeric IDs are 16-bit values in the range 1..65535.

The string-based node ID is the most important one for the user/administrator, because it is the ID you will see in log messages to conveniently identify the servers. Internally, however, BeeGFS uses the numeric IDs, mostly because they can be handled more efficiently.

Each BeeGFS server daemon checks for special files inside its storage directory during startup. To force certain IDs instead of having them generated automatically, create these special files before the first startup of the BeeGFS server daemon.
The names of these special files are nodeID and nodeNumID (plus, on storage servers, targetID and targetNumID for the storage target):

Example:
Assume you want to set the string-based node ID of your first storage server to "storage01" and the numeric ID to "1". The first storage server also provides the first storage target, so you would want to set the string-based target ID to "target01" and the numeric target ID to "1". (The storeStorageDirectory in /etc/beegfs/beegfs-storage.conf for this example is set to /mnt/myraid/beegfs_storage.)
To force these IDs, you would use the commands below before starting up the beegfs-storage daemon for the first time:
$ echo storage01 > /mnt/myraid/beegfs_storage/nodeID
$ echo 1 > /mnt/myraid/beegfs_storage/nodeNumID
$ echo target01 > /mnt/myraid/beegfs_storage/targetID
$ echo 1 > /mnt/myraid/beegfs_storage/targetNumID

The ID settings can be confirmed by checking the server log file (/var/log/beegfs-storage.log) after starting up the daemon, or by querying the management server:
$ beegfs-ctl --listnodes --nodetype=storage
$ beegfs-ctl --listtargets --longnodes

Important notes:


What needs to be done when a server hostname has changed?


Scenario: 'hostname' or '$HOSTNAME' reports a different name than during the BeeGFS installation, and the BeeGFS servers refuse to start up. The logs say that the nodeID has changed and that startup was therefore refused.

Note that by default, node IDs are generated based on the hostname of a server. As IDs are not allowed to change, see here for information on how to manually set your ID back to the previous value: "Is it possible to force a certain node ID or target ID for BeeGFS servers?"


My client refuses to mount because of an 'unknown storage target'


Scenario: While testing BeeGFS, you removed the storage directory of a storage server but kept the storage directory of the management server. Now the BeeGFS client refuses to mount and prints an error about an unknown storage target to the log file.

What happened to your file system: When you start a new beegfs-storage daemon with a given storage directory, the daemon initializes this directory by assigning an ID to this storage target path and registering this targetID at the management server.
When you delete this directory, the storage server creates a new directory on the next startup with a new ID and also registers this ID at the management server. (The storage server cannot know what happened to the old directory, or whether you might have just moved the data to another machine, so it needs a new ID here.)

When the client starts, it performs a sanity check by querying all registered targetIDs from the management server and checking whether all of them are accessible. If you removed a storage directory, this check fails and the client refuses to mount. (Note: This sanity check can be disabled, but it is definitely a good thing in this case and saves you from more trouble.)

Now you have two alternative options...

Solution A: Simply remove the storage directories of all BeeGFS services to start with a clean new file system:

1) Stop all the BeeGFS server daemons, i.e. beegfs-mgmtd, beegfs-meta, beegfs-storage:
$ /etc/init.d/beegfs-... stop (or use the Admon GUI)
2) Delete ("rm -rf") all their storage directories. The paths to the server storage directories can be looked up in the server config files.
3) Restart the daemons ("/etc/init.d/beegfs-... start" or use the GUI).

Now you have a fresh new file system without any of the previously registered targetIDs.

Solution B: Unregister the invalid targetID from the management server:
For this, you would first use the beegfs-ctl tool (part of the beegfs-utils package on a client) to list the registered target IDs:
$ beegfs-ctl --listtargets --longnodes
Then check the contents of the file "targetNumID" in the storage directory on the storage server to find out which targetID is the current one that you want to keep.
For all other targetIDs from the list that are assigned to this storage server but are no longer valid, use this command to unregister them from the management daemon:
$ beegfs-ctl --unmaptarget <targetID>
Afterwards, your client will no longer complain about the missing storage targets.

Note: There are options in the server config files to disallow initialization of new storage directories and registration of new servers or targets. They are not set by default, but should be set for production environments. See storeAllowFirstRunInit and sysAllowNewServers.


How can I solve problems with the .Xauthority file for home dirs?


If users have their home directory on BeeGFS, the .Xauthority file for X forwarding cannot be created correctly, due to missing hardlink support in the BeeGFS 2011.04 release series.
The solution is to simply create the .Xauthority file on a different file system, e.g. in /var/tmp, using an executable wrapper script:

/usr/local/bin/xauth.sh:
========================
#!/bin/sh
export XAUTHORITY=/var/tmp/Xauthority-$USER
exec /usr/bin/xauth "$@"

/etc/ssh/sshd_config:
=====================
XAuthLocation=/usr/local/bin/xauth.sh

/etc/profile.d/xauth.sh:
========================
export XAUTHORITY=/var/tmp/Xauthority-$USER

/etc/profile.d/xauth.csh:
=========================
setenv XAUTHORITY "/var/tmp/Xauthority-${USER}"


