
lam (7)

    NAME

    LAM - introduction to Local Area Multicomputer (LAM)
     
    

    DESCRIPTION

    LAM is an MPI programming environment and development system for a message-passing parallel machine composed of heterogeneous UNIX computers on a network. With LAM, a dedicated cluster or an existing network computing infrastructure can act as one parallel computer solving one compute-intensive problem. LAM emphasizes productivity in the application development cycle with extensive control and monitoring functionality. The user can easily debug the common errors in parallel programming and is well equipped to diagnose more difficult problems.

    LAM features a full implementation of the MPI communication standard, with the exception that the MPI_CANCEL function will not properly cancel messages that have been sent.  

    SEE ALSO

     

    Overview of Commands and Libraries

    introu(1), introc(2), INTROF(2)  

    Starting / Stopping LAM

    recon(1), lamboot(1), lamhalt(1), lamnodes(1), wipe(1), tping(1), lamgrow(1), lamshrink(1)  

    Compiling MPI Applications

    mpicc(1), mpiCC(1), mpif77(1)  

    Running MPI Applications

    mpirun(1), lamclean(1)  

    Running Non-MPI Applications

    lamexec(1)  

    Monitoring MPI Processes

    mpitask(1), mpimsg(1), lamtrace(1), fstate(1), doom(1), bfctl(1), MPIL_Comm_id(2), MPIL_Trace_on(2)  

    LAM's MPI Implementation

    mpi(7)  

    Reference Documents

    "LAM Frequently Asked Questions" at http://www.lam-mpi.org/faq/
    "MPI Primer / Developing with LAM", Ohio Supercomputer Center
    "MPI: A Message-Passing Interface Standard", Message-Passing Interface Forum, version 1.1 at http://www.mpi-forum.org/
    "MPI-2: Extensions to the Message Passing Interface", Message Passing Interface Forum, version 2.0 at http://www.mpi-forum.org/
     

    MPI Quick Tutorials

    "LAM/MPI ND User Guide / Introduction"
    at http://www.lam-mpi.org/mpi/tutorials/lam/
    "MPI: It's Easy to Get Started"
    "MPI: Everyday Datatypes"
    "MPI: Everyday Collective Communication"
     

    GETTING STARTED WITH LAM

    The user creates a file listing the participating machines in the cluster.

    % cat lamhosts
    # a 2-node LAM
    beowulf1.lam-mpi.org
    beowulf2.lam-mpi.org
    

    Each machine will be given a node identifier (nodeid) starting with 0 for the first listed machine, 1 for the second, etc.
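
    The numbering rule above can be sketched in C. This is only an illustration, not LAM's own parser: the function name assign_nodeids and the size limits are hypothetical, and the sketch assumes a boot schema contains nothing but '#' comments, blank lines, and one hostname per line.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_NODES 64   /* hypothetical limit for this sketch */
#define MAX_NAME  256  /* hypothetical limit for this sketch */

/*
 * Hypothetical helper, not part of LAM: scans boot schema text and stores
 * one hostname per node. The array index is the nodeid, so the first
 * listed machine becomes node 0, the second node 1, and so on.
 * Returns the number of nodes found.
 */
int assign_nodeids(const char *schema, char hosts[][MAX_NAME], int max_nodes)
{
    int count = 0;
    const char *p = schema;

    assert(schema != NULL && hosts != NULL);

    while (*p != '\0' && count < max_nodes) {
        /* Delimit the current line. */
        size_t len = strcspn(p, "\n");
        const char *line = p;
        size_t n = len;

        /* Drop everything from a '#' comment marker onward. */
        const char *hash = memchr(line, '#', n);
        if (hash != NULL)
            n = (size_t)(hash - line);

        /* Skip leading whitespace. */
        while (n > 0 && isspace((unsigned char)*line)) {
            line++;
            n--;
        }

        /* Keep the first whitespace-delimited token as the hostname. */
        size_t tok = 0;
        while (tok < n && !isspace((unsigned char)line[tok]))
            tok++;

        if (tok > 0 && tok < MAX_NAME) {
            memcpy(hosts[count], line, tok);
            hosts[count][tok] = '\0';
            count++;  /* the next host gets the next nodeid */
        }

        /* Advance past the line and its newline, if any. */
        p += len;
        if (*p == '\n')
            p++;
    }
    return count;
}
```

    Called on the contents of the lamhosts file shown above, the sketch would store beowulf1.lam-mpi.org at index 0 and beowulf2.lam-mpi.org at index 1, matching the n0 and n1 labels that the LAM tools print.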

    The recon(1) tool verifies that the cluster is bootable.

    % recon -v lamhosts
    recon: -- testing n0 (beowulf1.lam-mpi.org)
    recon: -- testing n1 (beowulf2.lam-mpi.org)
    

    The lamboot(1) tool actually starts LAM on the cluster.

    % lamboot -v lamhosts
    LAM 6.5.6 - University of Notre Dame
    Executing hboot on n0 (beowulf1.lam-mpi.org)...
    Executing hboot on n1 (beowulf2.lam-mpi.org)...
    

    lamboot(1) returns to the UNIX shell prompt. LAM does not force a canned environment or a "LAM shell". The tping(1) command confirms that the cluster and LAM are running.

    % tping -c1 N
      1 byte from 2 nodes: 0.009 secs
    
     

    Compiling MPI Programs

    mpicc(1), mpiCC(1), and mpif77(1) are wrappers for the C, C++, and F77 compilers, respectively. They link the LAM libraries and set up header and library search directories. Beginning with LAM version 6.3, the MPI library is also automatically linked to user applications; the -lmpi command line argument is no longer necessary.

    % mpicc -o foo foo.c 
    % mpif77 -o foo foo.f
    
     

    Executing MPI Programs

    An MPI application is started by one invocation of the mpirun(1) command. An SPMD application can be started on the mpirun(1) command line.

    % mpirun -v -c 2 trivial
    2445 trivial running on n0 (o)
    361 trivial running on n1
    

    An application with multiple programs must be described in an application schema, a file that lists each program and its target node(s). See appschema(5).

    % cat appfile
    # 1 master, 2 slaves
    n0 master 
    n0-1 slave
    
    % mpirun -v appfile
    3292 master running on n0 (o)
    3296 slave running on n0 (o)
    412 slave running on n1
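
    In the schema above, "n0 master" places one process on node 0 and "n0-1 slave" places one process on each of nodes 0 and 1, which gives the three processes that mpirun(1) reports. The following C sketch shows how such a node field could be expanded. It is an illustration under simplifying assumptions: the function expand_nodes is hypothetical, and it accepts only the forms "n<a>" and "n<a>-<b>" seen in this example, not the full notation described in appschema(5).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of LAM: expands a node field of the form
 * "n<a>" or "n<a>-<b>" into the individual node identifiers.
 * Returns the number of ids written to ids[], or -1 if the field is
 * malformed or does not fit in max_ids entries.
 */
int expand_nodes(const char *spec, int *ids, int max_ids)
{
    char *end;
    long first, last;

    assert(spec != NULL && ids != NULL);

    /* The field must start with 'n' followed by a digit. */
    if (spec[0] != 'n' || !isdigit((unsigned char)spec[1]))
        return -1;

    first = strtol(spec + 1, &end, 10);
    last = first;

    /* An optional "-<b>" suffix turns the single node into a range. */
    if (*end == '-') {
        if (!isdigit((unsigned char)end[1]))
            return -1;
        last = strtol(end + 1, &end, 10);
    }

    /* Reject trailing characters, reversed ranges, and oversized ranges. */
    if (*end != '\0' || last < first || last - first >= max_ids)
        return -1;

    for (long id = first; id <= last; id++)
        ids[id - first] = (int)id;

    return (int)(last - first + 1);
}
```

    With this sketch, expand_nodes("n0-1", ids, 4) would return 2 and fill ids with 0 and 1, while a field that is not a node specification, such as a program name, returns -1.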
    

    Applications can choose, at run-time, to use the "daemon" mode of communication or the "client-to-client" mode. Each has advantages and disadvantages, which are discussed in mpi(7).  

    Monitoring MPI Applications

    The full MPI synchronization status of all processes and messages can be displayed at any time. This includes the source and destination ranks, the message tag, the communicator, and the function invoked.

    % mpitask
    TASK (G/L)           FUNCTION      PEER|ROOT  TAG    COMM   COUNT   DATATYPE
    0/0 trivial          Ssend         1/1        123    WORLD  64      INT
    1/1 trivial          Recv          0/0        456    WORLD  64      INT
    

    Process rank 0 is blocked sending a synchronous message (MPI_Ssend()) to process rank 1 on tag 123 using the MPI_COMM_WORLD communicator. The message contains 64 integers. Process rank 1 is blocked on MPI_Recv() on the same communicator but with a different tag (456), so the send and the receive never match and both processes remain blocked.

    % mpimsg
    SRC (G/L)      DEST (G/L)     TAG     COMM    COUNT     DATATYPE    MSG
    0/0            1/1            123     WORLD   64        INT         n1,#0
    

    The unreceived message can be examined with mpimsg(1). The expected tag and communicator are shown, along with a message identifier that can be used to display the message contents.  

    Terminating Applications

    All user processes and messages can be removed without restarting LAM.

    % lamclean -v
    killing processes, done
    sweeping messages, done
    closing files, done
    sweeping traces, done
    

    This command is frequently used between MPI runs, especially while developing and debugging MPI programs.  

    Terminating LAM

    The lamhalt(1) tool removes all traces of the LAM session on the network.

    % lamhalt
    LAM 6.5.6 - University of Notre Dame
    

    Alternatively, if for some reason lamhalt(1) is not able to shut the running LAM down properly, the deprecated wipe(1) command can be used with the boot schema that was used to originally boot LAM:

    % wipe -v lamhosts
    Executing tkill on n0 (beowulf1.lam-mpi.org)...
    Executing tkill on n1 (beowulf2.lam-mpi.org)...
    
     

    LAM STRUCTURE

    LAM runs on each computer as a single UNIX daemon uniquely structured as a nano-kernel and hand-threaded virtual processes. The nano-kernel component provides a simple message-passing, rendezvous service to local processes. Some of the in-daemon processes form a network communication subsystem, which transfers messages to and from other LAM daemons on other machines. The network subsystem adds features like packetization and buffering to the base synchronization. Other in-daemon processes are servers for remote capabilities, such as program execution and parallel file access. The layering is quite distinct: the nano-kernel has no connection with the network subsystem, which has no connection with the servers. Users can configure services in or out as necessary.

    The unique software engineering of LAM is transparent to users and system administrators, who only see a conventional daemon. System developers can de-cluster the daemon into a daemon containing only the nano-kernel and several full client processes. This developer's mode is still transparent to users but exposes LAM's highly modular components to simplified individual debugging. It also reveals LAM's evolution from Trollius, which ran natively on scalable multicomputers and joined them to the UNIX network through a uniform programming interface. Trollius is the ultimate heterogeneous parallel environment.

    The network layer in LAM is a documented, primitive and abstract layer on which to implement a more powerful communication standard like MPI.  

    Debugging

    A most important feature of LAM is hands-on control of the multicomputer. There is very little that cannot be seen or changed at runtime. Programs residing anywhere can be executed anywhere, stopped, resumed, killed, and watched the whole time. Messages can be viewed anywhere on the multicomputer and buffer constraints tuned as experience with the application dictates. If the synchronization of a process and a message can be easily displayed, mismatches resulting in bugs can easily be found. These and other services are available both as a programming library and as utility programs run from any shell.  

    MPI Implementation

    MPI synchronization boils down to four variables: context, tag, source rank, destination rank. These are mapped to LAM's abstract synchronization at the network layer. MPI debugging tools interpret the LAM information with the knowledge of the LAM/MPI mapping and present detailed information to MPI programmers.

    A significant portion of the MPI specification can be (and is) implemented completely within the runtime system and independent of the underlying environment.

    As with all MPI implementations, LAM must synchronize the launch of MPI applications so that all processes locate each other before user code is entered. The mpirun(1) command achieves this after finding and loading the program(s) which constitute the application. A simple SPMD application can be specified on the mpirun(1) command line, while a more complex configuration is described in a separate file, called an application schema.

    MPI programs developed on LAM can be moved without source code changes to any other platform that supports MPI.


     

    Index

    NAME
    DESCRIPTION
    SEE ALSO
    Overview of Commands and Libraries
    Starting / Stopping LAM
    Compiling MPI Applications
    Running MPI Applications
    Running Non-MPI Applications
    Monitoring MPI Processes
    LAM's MPI Implementation
    Reference Documents
    MPI Quick Tutorials
    GETTING STARTED WITH LAM
    Compiling MPI Programs
    Executing MPI Programs
    Monitoring MPI Applications
    Terminating Applications
    Terminating LAM
    LAM STRUCTURE
    Debugging
    MPI Implementation

