/************************************************************************ * * COSI 220 -- Port Redirector * Homework #2 * Jeremy Gilbert * March 11, 1998 * * proxy.c * * COPYRIGHT(C) 1998 Jeremy Gilbert * ************************************************************************ * * This is a simple TCP port redirector which functions with the * following syntax: * * % proxy * * While running, this process listens at for incoming * TCP connections on the host. Every time an incoming connection * arrives, a new connection from the host is initiated to * at port . The proxy sits in a loop * copying input back and forth passing data through the connection. * * * OVERVIEW * * This proxy is implemented entirely in a single thread. Select(2) is * used to multiplex the connection for multiple clients. All * potentially blocking I/O is preformed on non-blocking file * descriptors (set using fcntl.) This style of implementation is * much faster than using fork(2) to multiplex the connections since * it avoids the high overhead of process of process creation. On a * single processor machine, R. W. Stevens notes that this type of * server with select(2) and nonblocking I/O is more effecient for * this type of echoing operation than using fork or even * threads. (See section 15.2, page 409) In fact, the fastest * implemtnation of an echo service given in the Stevens book uses * this model. The other advantage of a single thread implementation * is that its design will make lots of things easier for us later on * if we want to add any sort of caching or other operations which * would require communication between multiple threads. * * Dealing with select(2) and asychronous non-blocking I/O is a lot of * work and requires a rather non-trivial implementation. We must * carefully maintain buffers and FIFOs so that we can handle all * possible syscall interruptions and potentially partial reads and * writes. The select(2) model also involves the maitainance of a * variable number of producer/consumer pairs which must remain * simultaneously active. * * * DATA STRUCTURES AND GENERAL ALGORITHIM * * As previously alluded, there is a significant engineering problem * involved in multiplexing non-blocking I/O with select(2). At any * given time, we can think of two reader/writor pairs for each proxy * connection. The first pair takes data from the incoming client and * writes it to the destination. The second pair reads from the * destination and connects it to the incoming client. We can think * of these directional flows as "network pipes", or "netpipes." For * each reader/writer pair/netpipe, it is possible there will be * pending data which can be read while the writer end is unavailable. * In this case, we must carefully enqueue the data into a FIFO every * time we read(2). Similairly, when the writing end is available, we * must dequeue block in order and write them out. Handling this mess * in practice requires very careful modification of the fd vectors we * pass to select to make sure only the smallest set of truly * available sockets will wake the process up. (There is no point in * responding to a write ready if there is nothing to write.) * * This is all further complicated by the fact that we can obtain * short reads and writes. In the case of a short read, no harm is * done -- each block of data in the queue is simply the size of * whatever we were able to get from read(2). However, in the case of * write(2) we cannot completely dequeue a block unless the entire * block is written. This means several exception cases must exist * when a block is available for writing. * * For example, imagine that there are 5 seperate 8.0K chunks of data * in a particular fifo, and the writing end becomes available. If * write(2) only dumps 5K into the output channel, the remaining 3K of * the 5th chunk must stay in the queue until the port is readable * again. To handle this situation, we use a special block_q data * structure for each chunk. This structure keeps track of the number * of bytes remaining to be written so that a short write(2) of the * chunk can be restarted. Only when all of the chunck stored in the * block_q is written can the item be fully dequeued. * * In this code, a special struct called a netpipe_t is created for * each producer/consumer action. Each proxy client will have TWO * netpipes structs allocated at any given time to handle data flowing * in both directions. This netpipe struct tracks the file * descriptors for each end of the directional pipe. (Each actual * connecting proxy requires TWO of these structes for both directions * of data flow.) The netpipe_t structure contains a pointer to a * linked list of block_q elements which serve as the fifo for data * that has been read but not yet written. * * * BASIC OPERATION WITH SELECT * * The basic operation of the program is as follows. First, arguments * are read in and the remote destination is looked up. Secondly, a * new socket is created called 'ls' which will serve as our listening * socket for all new connections. Various socket options are set, a * signal handler is installed and various data structures are * initialized. * * The main loop of the program calls the select(2) system call to * figure out what file descriptors are ready. Two file descriptor * vectors, rset and wset specify the "interesting" file descriptors * at any given time. At first, wset is cleared and rset is only set * on the listening file descriptor 'ls', which will become readable * when an incoming client connects. Now select(2) is called from the * main loop, and the program goes to sleep until one of the * "interesting" file descriptors becomes ready. Copies of rset and * wset are made in trset and twset and passed to select. * * Eventually select(2) will wake the process up, and return sets of * readable and writeable file descriptors in trset and twset * respectivly (blowing away the previous values of the fd vectors.) * Upon return from select, the code now must check all of the file * descriptors it cares about to see if they were ones that actually * woke the process up. * * First, we check if the 'ls' socket is readable. If it is, a new * connection is pending on the listening socket and we can call * accept(2) to open it. The actual new client process is described in * better detail below. * * If there are still unhandled file descriptors after we have checked * for a new listener connection, the program now loops through the * active portions of the netpipe[] array, checking for ready file * descriptors on the in or out ends for each direction of data flow. * If the input file descriptor has become readble according to trset, * we call read(2) on it. The read(2) call will return a section of * new data from the input socket which we enqueue in the * netpipe[].fifo data structure for later use. If we have a * completly short read, we are at EOF, and therefore we expect no * further data from the client. We then take steps to shutdown the * writing end and mark the netpipe as closed. Otherwise, since there * is new data that must be written out, we make sure that the writing * end of this netpipe_t has been marked to select(2) as * "interesting." Now, whenever the select(2) call discovers the * writing end can be written to, we will be notified and this block * can be written out. If multiple blocks come in before any are * written out, they are enqueued in the fifo. * * If we discover that the writing end is available according to * twset, a similiar process occurs. We locate the oldest waiting * block from the output end of the queue associated with that * netpipe_t and issue a write(2) on the block. If the entire block * does not get written, we adjust an internal pointer in the queue so * that the element will be revisted next time and the remainder * written. Otherwise, if the entire block did get written out, we * dequeue it and free(3) the struct and the associated block. If * there is nothing else in the writing queue, we unset our entry in * wset so that no further write-ready events are marked to us. Only * when there is more data enqueue by the reading procedure will the * write handler be activated in the fd vector. * * One slight peice of inefficiency occurs here: if we suceed in * writing out an entire block from the queue, it is possible that we * could go ahead and dequeue another block and try writing that one * too. For simplicity's sake, we don't bother. The writable flag in * the select write vector remains set in this case, and the next call * to select will inevitably call the code again to try another * block. This has very little effect on actual effeciency since on * the average connection the size of blocks read are likely to be the * same size as those that can written fully. * * In this way, each netpipe_t directional pipe is serviced at its * reader and writer ends ensureing that data flows through in order * and completely without blocking. * * * INITIATING CONNECTIONS * * In the main loop, after accept(2) returns, we begin preparations to * connect to the remote side and call connect(2). One connect * returns, we initialize TWO new netpipe structures to service the * bidirectional flow of information. We initialize these structures * so that one of them moves data from the incoming fd to the outgoing * fd, and another moves it in the opposite direction. * * All sockets opened by accept and the client connect(2) calls are * set O_NOBLOCK using fcntl(2) to avoid any blocking that would * suspend the entire process. If a read(2) or write(2) operation is * 'short' due to a full TCP buffer or the like, we receive * EWOULDBLOCK and figure out how much of the system call was actually * completed, instead of the process blocking entirely. This allows * the program to manage its resources carefully and not be at the * mercy of the blocking nature of standard I/O. * * * CLOSING CONNECTIONS * * When either the reader or writer code detects an EOF condition, the * proper cleanup operations must occur. There are three basic * possibilities: * * If the reader has sent FIN, then a special flag called rclose is * set to indicate that there will be no more data coming down the * line from that direction. The code then checks if the fifo is * empty, and if so, it goes ahead and shuts down (sends FIN) to the * output end of the pipe since there will be no more data. Once this * occurs, the wclose is also set. * * Otherwise, the writer code checks to see if rclose is set when it * writes the last block out. If the buffer has just cleared and the * reader has issued EOF, then the writer can issue EOF too. Wclose is * set and shutdwon is set. * * A final possibility is that the shutdown will have occured on the * writers end. In this case, there is data in the queue that will * never be written out. We purge all of it in a loop and set rclose * and wclose for the pipe. * * A special check is made for each netpipe to determine if the * netpipe is both rclose and wclose. In this case the netpipe is * deallocated from the system and the two file descriptors are * formally closed. * * The preceding code only handles a partial shutdown of the * socket. It is not appropriete for the netpipe to fully destroy the * involved sockets at termination since there may be one other * netpipe sitting somewhere which may be still using it. Therefore, * netpipes know their opposing pipe ID, and when they close out they * check to see if they are the last of the pair to finish. If this is * the case, they can safely perform the close on the involved file * descriptors and make sure they are removed from rset and wset. This * also provides a mechanism for maintaining connection counts. * * * * TESTING * * This code has been tested with every common client/server * combination I could think of. I have tested the maximum client * functionality, the blocking of reads and writes, shutdowns from * both ends of the server, and numerous other boundary cases. It has * been used to move many multiple streams two and from various * network daemons without any difficulties. * * * KNOWN BUGS * * The client's connect(2) call may still block the process, although * in practice this is relativly rare. The only way to avoid this is * to engage in a massive codeing project to handle mutli-state client * initiations which would not be portable across all UNIX derrivites. * * Under very rare conditions (Stevens 15.6) accept(2) can block the * process even though we avoid calling it until select(2) tells us * that descriptor is read -- for instance if the client sends RST * after a connect, additional handshaking is required. The workaround * for this problem is not as difficult as for connect(2), but may * require a non-local goto in the processing loop. * * Also, the code may compile cleanly only on POSIX.1g complient systems. * * * OTHER COMMENTS * * This source code is formatted using BSD source indentation (of * course.) * *********/ /* Define constants */ /* The most number of bytes of a bufer to print out in SHOWDATA mode. See print_buffer comments for more details */ #define MAXPRINTSIZE 10 /* The most number of clients let listen(2) keep waiting. On high load machine, set this up high */ #define MAXLISTEN 10 /* The maximim size of the read/write buffer. Make this high too. */ #define BUFSIZE 2000 /* This defines the maxmimum number of proxy connections we can have at any time */ #define MAXCLIENTS 200 /* We have to define this POSIX crap since 1.0g isn't mainstream. */ #define SHUT_RD 0 #define SHUT_WR 1 /* Debugging Constants */ /* Prints the actual data as it is written */ #define SHOWDATA 0 /* If DEBUG is set, we get lots of nice status information printed out */ #define DEBUG 0 /* Program Structures */ /* The block_q structure is used to create FIFOs. A chain of these may * exist at any given point for each direction data must flow. In this * struct, we keep track of the original handle and the block pointer * seperatly since the block pointer may be incremented in the event * of a partial write, and we have to maintain a copy of the original * block for the purposes of free(2)ing it later. */ typedef struct block_q { char *block; /* A pointer to the begining of the readable block */ char *handle; /* The original pointer for use with free(3) */ int size; /* The size of remainder of the buffer */ struct block_q *next; /* The next guy in the list */ } block_q; /* Each active client has a netpipe_t struct which contains all of its * control information. A "client" is actually two netpipes, since * two of these structures will be created for each proxy connection * for both directons that data must travel. The netpipe struct * contains all the information neccesary to move data from one file * descriptor to another. It maintains the file descriptors on both * ends (the reading end and the writing end) and it maintains a * block_q strucutre. During the select(2) loop, a producer/consumer * arrangment will add and subtract sections of data to the fifo. */ typedef struct netpipe_t { int used; /* A flag if the entry is used */ int infd; /* The fd for the reading end */ int outfd; /* The fd for the proxy redirect */ struct block_q *fifo; /* Pointer to the begining of the forward pipe */ int pair_id; /* An index to the other netpipe in the pair */ int rclose; /* Indicates that the reading end is closed */ int wclose; /* Indicates that the writing end is closed */ } netpipe_t; /* Standard includes for network stuff */ #include #include #include #include #include #include /* For debug() */ #include #include /* For strdup(3) */ #include /* Quite useful for debugging */ #include #include /* For nonblocking */ #include #include #include #include #include /* For timeval in select */ #include /* sockaddr_in and other fun stuff */ /* The following is only needed to support the broken resolve_name * function imported from client.c. Posix 1.0g defines a better way to * do this. */ #include #include #include #include #include /* End of program includes */ /* exit_error: A small function to call perror(3F) with the given * argument and exit with an error status. Essentially a wrapper for a * perror and an exit. The first argument is passed to * perror(3F). This function does not return. */ void exit_error( char *s ) { perror( s ); exit(-1); } /* print_buffer: A utility function to print out to the screen a small * section of a buffer for debugging purposes. The function takes a * pointer to a buffer and a size. If the size is greater than the * constant MAXPRINTSIZE, only the first MAXPRINTSIZE bytes will be * printed. The output of this function show the hexadecimal * representation of the buffer following by the ASCII representation. * All non-printables are shown with a '.'. This function has no * return value. */ void print_buffer( char *buf, int size ) { int i; /* A looping variable */ /* Check to see if the buffer is bigger than MAXPRINTSIZE and if * so, limit the actual size printed. This is a good idea if * buffers can be many kilobytes and you only want to see the * first few characters. */ if( size > MAXPRINTSIZE ) size = MAXPRINTSIZE; printf( "[ " ); /* An opening delimiter */ /* Loop over the relevant part of the buffer and print out hex * values. We have to coerce an unsigned char so the compiler * passes the right stuff to printf */ for( i=0 ; ih_name)) >= canonl) ? canonl-1 : n ; memcpy(canonp, hp->h_name, n) ; canonp[n] = '\0' ; return( *( u_long *) hp->h_addr) ; } } /* gd_time: A utility to print the time in microseconds. Copies from * page 404 of Stevens UNIX Network Programming book. This code * accepts no arguments and returns a character string containing the * current time in hours, minutes, seconds and microseconds.*/ char *gd_time(void) { struct timeval tv; static char str[30]; char *ptr; if( gettimeofday(&tv, NULL) < 0 ) exit_error( "gettimeofday" ); ptr = ctime(&tv.tv_sec ); strcpy(str,&ptr[11]); snprintf( str + 8 , sizeof(str) - 8 , ".%06ld", tv.tv_usec ); return(str); } /* debug: A neat utility function that works just like printf except * that it writes a nice timestamp along with the text, and will only * print if DEBUG is true. The semantics of debug() are identical to * printf. This function has no return value (unlike printf(3)) */ void debug( char *fmt, ... ) { if( DEBUG ) { va_list args; va_start(args,fmt); printf( "%s : ", gd_time() ); vfprintf( stdout, fmt, args ); va_end(args); } } /* find_next_netpipe: A very quick function to scan the netpipe array * and pick the first unused slot. Since we have to do this twice for * each proxy connection, it makes sense to have this be an external * function. We check to be sure that a netpipe is found -- if there * is no more space for a netpipe, then we error out with a fatal exit * code. There should be other checks in the program to not accept any * new connections if a certain maximum of existing connections is * exceeded and this should avoid the fatal case here. This function * takes a single input parameter the array of netpipes. It returns * the index of the first available one. */ int find_next_netpipe(netpipe_t *netpipe) { int i; for( i = 0 ; i < FD_SETSIZE ; i++ ) if( netpipe[i].used == 0 ) return i; /* If we are still here, a slot could not be found. Error out. */ return -1; } /* MAIN Program Code: Most of the functionality of the program is in * the main loop. There are few reasons to make any of the code * modular, and doing so would require the introduction of global * state variables. */ int main(int argc, char *argv[]) { /* Arguments and Enviroment Variables */ char *progname; /* From argv */ char *redirect_addr; /* From argv */ int redirect_port; /* From argv */ int listen_port; /* From argv */ char redirect_addr_can[128]; /* Provided to us by resolve_name */ u_long redirect_addr_ip; /* IP of destination, from resolve_name */ /* Listening Socket Variables */ struct sockaddr_in sin; /* Contains our main listening socket */ int ls; /* Our listening socket file descriptior */ /* Variables for network and select(2) semantics.*/ int i; /* Netpipe_T iterator for forwards pipe */ int j; /* Netpipe_T iterator for backwards pipe */ int maxfd; /* Tracks the maximum file descriptor for our calls to select */ fd_set rset; /* The standard set of reading fds */ fd_set wset; /* The standard set of writing fds */ fd_set trset, twset; /* Temporary copies of rset and wset we actually pass to select (select will munge the originals we pass it. */ char iobuff[BUFSIZE]; /* A buffer for read(2) and write(2) */ /* Netpipe state information -- For each client, we store * information on the two directions dast must flow in a struct of * type netpipe_t. This struct maintains the file descriptors, * pointers to buffers, status and availability information, and * the like for each direction that our loop must move data. We * assume that since there is a certain maximum number of file * descriptors which sets the effective bounds for the number * of netpipes we can handle, the constant FD_SETSIZE is a * good size for the array. */ netpipe_t netpipe[FD_SETSIZE]; int maxi = 0; /* The index of the highest used index of netpipe[] */ int numclients = 0; /* The number of active client connections */ /* End of Variable Declarations */ /* Main program begins here */ /* Initilize the netpipe[] structs. We don't have to bother with * anything other than the 'used' field since the code to handle * the accept will have to initialize these field anyway. */ for( i = 0 ; i < FD_SETSIZE ; i++ ) { netpipe[i].used = 0; } /* Parse and check the arguments, and fill in the three argument * variables in the main program for future use. Else, exit with * an error to the user. */ progname = strdup( argv[0] ); /* Useful for error messages and the like */ /* Make sure we got three arguments and they are all valid. */ if( argc != 4 || argv[1] == NULL || argv[2] == NULL || argv[3] == NULL ) { fprintf( stderr, "Usage %s: \n", progname ); exit(-1); } /* Copy stuff from argv into our local variables, converting the * port strings into numbers for the sock_addr */ redirect_addr = strdup( argv[1] ); redirect_port = atoi( argv[2] ); listen_port = atoi( argv[3] ); /* Check for crazy or out of bounds port numbers. If the user * specifies any reserved port numbers and is not root, this will * be discovered when we do a bind(2). */ if(!( redirect_port > 0 && listen_port > 0 )) { fprintf( stderr, "%s: Either dest_port or listen_port are invalid\n", progname ); exit(-1); } /* Attempt to resolve the address given to us for the destination * of the proxy. There is no point in continuing if we cannot do * this, and this will safe future lookups later. */ redirect_addr_ip = resolve_name( redirect_addr, redirect_addr_can, 128 ); /* Check if we actually got a valid network address, and if not, * compalin to the user that the hostname cannot be resolved * properly. */ if( redirect_addr_ip == 0 ) { fprintf(stderr, "%s: Cannot resolve hostname '%s'\n", progname, redirect_addr ); exit(-1); } /* Under extremely rare conditions, a write(2) may occur to a * socket which has recently been disconnected. This can happen if * the redirector is heavily loaded and the remote server drops * the connection unexpectedly. To prevent the resulting SIGPIPE * signal from exiting the program we trap it here and ignore * it. */ signal( SIGPIPE, SIG_IGN ); /* Open a socket for listening. We will use this socket 'ls' to * receive new connections. */ if((ls = socket(PF_INET, SOCK_STREAM, 0)) < 0) exit_error( "socket" ); /* We set the REUSEADDR socket option on 'ls' so we can ignore * problems with previous connections still in TIME_WAIT. */ { int n = 1; if( setsockopt( ls, SOL_SOCKET, SO_REUSEADDR, &n, sizeof(n)) < 0 ) exit_error( "setsockopt(SO_REUSEADDR)" ); } /* Bind my socket address to a connection after we fill out sin */ sin.sin_family = AF_INET; sin.sin_addr.s_addr = htonl(INADDR_ANY); /* anyone can connect */ sin.sin_port = htons(listen_port); if( bind(ls, (struct sockaddr *) &sin, sizeof(sin)) < 0) exit_error( "bind" ); /* Now call listen on the socket, allowing MAXLISTEN pending * connection requests */ if( listen(ls, MAXLISTEN) < 0) exit_error( "listen" ); /* Now that we are ready to start the main loop, we can notify the * user what we plan to do before we enter the loop. */ fprintf( stderr, "%s started, listening at port %d and redirecting to %s:%d\n", progname, listen_port, redirect_addr_can, redirect_port ); /* When we start iterating using select(2), we will have to give * the system call a set of fds we care to know about for reading * and for writing. Since select(2) will actually alter our copy * of the set variable, we must always pass select(2) a copy of * these sets. Our general strategy is to keep the set of * "interesting" fds in the variables rset and wset and make * copies of these variables in trset and twset. We initialize * these sets below before we start iterating. Right now, the * only file descriptor we care about is 'ls', since no proxy * connections are in progress and incoming clients may wish to * connect to us. */ FD_ZERO(&rset); /* Clear out the set */ FD_ZERO(&wset); /* Clear out the set */ FD_SET( ls, &rset ); /* We care only for pending read on our listener socket, in which case we will do an accept */ /* Also during our big loop, we keep track the maximum file * descriptor so we can tell select how many FDs we care about. * Every time we are given a new fd we care about, we carefully * see if this must be incremented. Strange things will happen if * we neglect this. Since there are no new clients or * connections, the maximum fd is by necessity the incoming * listener socket. */ maxfd = ls; /* Now comes the big infinite loop which handles all of our * network I/O. The basic structure is as follows: We create * copies of rset and wset and pass them to select. When select * next returns, it returns to us the actual number of ready file * descriptors. First, we check if a new connection is available * on 'ls'. If it is, we initiate a new client connection and set * up two netpipes to handle the flow of data. Now, we iterate * over all of the 'active' netpipes and see if either their * reading or writing ends are available, and take whatever action * is neccessary. In the case of more available reads, we pull in * data and add it to the buffer. In the case of an available * writes, we dequeue data from the buffer and attempt to write it * out. We also check for closed file descriptors, and if we get * any, we initiate shutdown(2) on the opposing port. We maintain * a variable called 'nready' which tracks the number of FDs we * have actually serviced. If at any point we notice that nready * is zero, we know we have serviced everything. Then we can omit * certain further FD checks and go back to the next select(2) * call. */ for(;;) { int nready; /* The number of fds ready */ debug( "====================== Blocking on Select, %d clients\n", numclients ); /* Copy the template sets into our temporary sets for * select. Select will munge these. */ trset = rset; twset = wset; /* Call select to find out what descriptors are ready. */ nready = select( maxfd + 1, &trset, &twset, NULL, NULL ); if( nready < 0 ) exit_error( "select" ); debug( "Select indicates %d ready descriptors.\n", nready ); /* Check the status of 'ls' to see if this is a new client * connection. If so, set up the new proxy by allocating two * netpipes to it and opening a connection to the proxy * destination. */ if( FD_ISSET(ls, &trset ) ) { int connfd = -1; /* A FD to hold any newly accepted sockets */ int val; /* Temporary value for fcntl(3) */ struct sockaddr_in cin; /* connecting socket info */ int cfd = -1; /* A FD to hold outgoing connection sockets */ struct sockaddr cliaddr; /* For accept */ int clilen = sizeof( cliaddr ); /* For accept */ /* IF numclients > MAXCLIENTS, don't do anything, error out */ debug( "Notified by select of new connection.\n" ); /* Now we can call accept(2) and establish the new connection * to the incoming client. */ connfd = accept( ls, &cliaddr, &clilen ); if( connfd < 0 ) exit_error( "accept_client" ); debug( "Accepted connection, connfd=%d.\n", connfd ); /* The very first thing we should do is use fnctl(2) to * add O_NONBLOCK to the other existing control flags for * connfd. This will make our read(2) and write(2) system * calls nonblocking, which is neccessary for proper * asnychronous I/O. Instead of blocking, they will return * EWOULDBLOCK. * * The way we do this is to query the descriptor for the * existing flags and placing them into val. Then we set * the flags on the descriptor to val ORed with * O_NONBLOCK. */ val = fcntl( connfd, F_GETFL, 0 ); if( val < 0 ) exit_error( "fcntl(GETFL)" ); if( 0 > fcntl( connfd, F_SETFL, val | O_NONBLOCK )) exit_error( "fnctl(SETFL)" ); /* Now we attempt to issue a connect to the destination * address for the proxy. We begin this by filling out a * 'cin' structure. */ cin.sin_family = AF_INET; cin.sin_addr.s_addr = redirect_addr_ip; cin.sin_port = htons(redirect_port); /* Create a socket of type stream for the destination side */ if ((cfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) { exit_error("main: socket failed"); } /* Call connect on the client socket cfd. */ if (connect(cfd, (struct sockaddr *) &cin, sizeof(cin)) < 0) { exit_error("main: connect failed"); } /* Now set O_NONBLOCK on the cfd the same way we did for * connfd above. This ensures that neither direction is * blocking. */ val = fcntl( cfd, F_GETFL, 0 ); if( val < 0 ) exit_error( "fcntl(GETFL)" ); if( 0 > fcntl( cfd, F_SETFL, val | O_NONBLOCK )) exit_error( "fnctl(SETFL)" ); debug( "Connection to remote host established on fd %d\n", cfd ); if(( i = find_next_netpipe(netpipe) ) < 0 ) { fprintf(stderr, "%s: Too many netpipes!\n", progname ); exit(-1); } debug( "Initializing netpipe entry %d for forward pipe.\n", i ); /* Initialize the netpipe[] structure with default values */ netpipe[i].used = 2; netpipe[i].infd = connfd; netpipe[i].outfd = cfd; netpipe[i].fifo = NULL; netpipe[i].rclose = 0; netpipe[i].wclose = 0; if(( j = find_next_netpipe(netpipe) ) < 0 ) { fprintf(stderr, "%s: Too many netpipes!\n", progname ); exit(-1); } debug( "Initializing netpipe entry %d for backward pipe.\n", j ); /* Initialize the netpipe[] structure with default values */ netpipe[j].used = 2; netpipe[j].infd = cfd; netpipe[j].outfd = connfd; netpipe[j].fifo = NULL; netpipe[j].rclose = 0; netpipe[j].wclose = 0; /* Set set each netpipe's pair_id to the other guys. This * will be helpful later during cleanup since we don't * know which netpipe will be the one to have to actually * close both ends of the file descriptors allocated to * both of them. */ netpipe[i].pair_id = j; netpipe[j].pair_id = i; /* Now we flag reading events as "interesting" to the * select system call. We don't flag writing events as * interesting yet, because we don't have anything to * write yet. */ FD_SET( connfd, &rset ); FD_SET( cfd, &rset ); /* Now that we have been issued a new file descriptor by * the kernel, check to see if we now have to increment * maxfd. Having an up-to-date value for maxfd is * important for the select(2) system call to work * properly since it must always know the highest FD that * we care about. */ if( connfd > maxfd ) maxfd = connfd; if( cfd > maxfd ) maxfd = cfd; /* For efficiency, we keep track of the highest * initialized netpipe[] element in a variable called * 'maxi'. When we loop over all of the netpipes later on, * we can know the highest index we need to iterate to. */ if( i > maxi ) maxi = i; if( j > maxi ) maxi = j; /* Finally, increment our 'numclient' variable. This * gives us a fast way to check if we can allow any more * accepts. */ numclients++; fprintf( stderr, "New client started\n" ); /* One of the things we need to check is to see if we have * exceeded our maximum number of clients. If this * happens, we take our listening socket off of the list * of file descriptors so that we will not call accept on * it anymore. Later, when numclients is lower again, we * can add it back on. */ if( numclients >= MAXCLIENTS ) FD_CLR( ls, &rset ); /* Check to see if we have just handled the last ready * descriptor. If we have, we can avoid any further file * descriptor testing. */ if( --nready <= 0 ) continue; /* No more readable descriptors */ debug( "End of listen code\n"); } /* End of listen/fd checking code */ /* Now we iterate through all of the active netpipes and check * if any of their file descriptors have become ready. We only * have to check up to netpipe[maxi] since indexes above this * value have not been touched yet. */ for( i = 0 ; i <= maxi ; i++ ) { int infd, outfd; /* Local copies to make the code cleaner */ if(!netpipe[i].used) continue; /* If this slot isn't being used, skip it */ /* debug( "Checking netpipe %d of %d\n", i, maxi );*/ /* We make local copies of infd and outfd to make life * easier for us later on and so the code is cleaner. */ infd = netpipe[i].infd; outfd = netpipe[i].outfd; /* First, test if the READ end of this pipe is ready and * handle it if this is the case */ if( FD_ISSET( infd, &trset )) { int n; /* Used for call to read(2) */ debug( "Netpipe %d infd %d is ready for reading\n", i, infd ); debug( "Netpipe %d has fifo %p\n", i, netpipe[i].fifo ); /* Issue the actual call to read(2). When we call read(2), there are three basic possibilities. Either there is some sort of error (n<0), there is an EOF condition (n == 0) or a buffer of data has been read in (n>0). Test for each case */ if((n = read( infd, iobuff, BUFSIZE )) < 0 ) { /* In this case N<0 and there is some sort of * error. If the error is EWOULDBLOCK, we * actually don't care. But we should never get * EWOULDBLOCK since we would not have called * read(2) if select(2) had not indicated to us * that there was pending data. */ if( errno != EWOULDBLOCK) exit_error( "read error on socket" ); debug( "Wierd -- should not be here\n" ); } else if ( n == 0 ) { /* This is the case where there is a pending * EOF. According to Stevens, there is only one * situation where select(2) will indicate a * socket is available for writing but that * read(2) returns a zero value of bytes * written. In this case, the sending host on infd * has indicated an EOF condition. Once the buffer * clears, it will be safe to shutdown this * netpipe. */ debug( "read: netpipe %d, sock %d indicates EOF\n", i, infd ); /* Indicate to the kernel that we no longer wish to be woken up for read events on this pipe. */ FD_CLR( infd, &rset ); /* Mark the netpipe as being rclosed. Now, when * the write queue clears, we can close the * netpipe completely. */ netpipe[i].rclose = 1; /* Check to see if the fifo is empty. If it is, * then we can close the write end of the netpipe * too. Otherwise, when the consumer end of the * netpipe notices that there is nothing more to * send, it will clean this stuff up for us. */ if( netpipe[i].fifo == NULL ) { debug( "Closing write end since buffer is clear\n" ); /* Shutdown the write end */ if( shutdown( outfd, SHUT_WR ) < 0 ) /* writing disallowed, sends FIN */ exit_error( "shutdown outfd EOF" ); /* We can go ahead and immediately mark this * netpipe as wclose. It will get reaped by * the cleanup code later on. */ netpipe[i].wclose = 1; /* Indicate to the kernel that we no longer * wish to be woken up for write events on the * opposite end of the pipe. */ FD_CLR( outfd, &wset ); } } else { /* This is the case where n>0 and we have read * some number of bytes in from infd. The idea * here is to enqueue this data into * netpipe[i].fifo so that it will be written out * when the destination pipe becomes ready. * Hence, we have a reader/writer sitation between * the code that handles the read(2) on infd and * the code that handles the write(2) on outfd. */ block_q *b = NULL; /* This will be our new block_q item */ char *stuff; debug( "read %d bytes from socket %d\n", n, infd ); #if SHOWDATA debug( "ENQUEUEing into pipe %i: ", i); print_buffer( iobuff, n ); #endif /* Allocate new space for the block_q struct */ b = (block_q *)malloc(sizeof(block_q)); if(!b) exit_error( "block_q malloc" ); /* Allocate new space to copy iobuff[] into the * block_q */ stuff = (char *)malloc(n*sizeof(char)); if(!stuff) exit_error( "stuff malloc" ); /* Now we can go ahead and copy iobuff[] into this * 'stuff' variable that contains the block of * data headed out for the other end of the * pipe. */ if(stuff != memcpy( stuff, iobuff, n )) exit_error( "enqueue memcpy" ); debug( "Created new block=%p, buffer=%p\n", b, stuff ); /* debug( "Existing queue value = %p\n", (netpipe[i].fifo)); */ /* Fill out the entries in the new block_q */ b->size = n; b->block = stuff; b->handle = stuff; /* Insert 'b' into the head of the list */ b->next = (netpipe[i].fifo); (netpipe[i].fifo) = b; /* debug( "New queue value = %p\n", (netpipe[i].fifo));*/ /* Now that we have pending output, make sure that * we are woken up when the write pipe becomes * ready. */ FD_SET( outfd, &wset ); } } /* Now we check to see if the WRITE end of the pipe is * ready, and handle this case. */ if( FD_ISSET( outfd, &twset )) { /* Now we must call write(2) on the tail of the fifo * queue and figure out where to go from there. */ block_q *b = (netpipe[i].fifo); /* Used to iterate through the fifo */ block_q *prev = NULL; /* Contains the previous entry to b */ int n; /* To hold the number of bytes written */ debug( "Netpipe %d outfd %d is ready for writing\n", i, outfd ); /* Now figure out what the LAST element of the fifo * is. Sit in a while loop until we reach the end of * the list. We also care about the previous element, * since if it comes to pass that we must dequeue the * last element of the list, it will be neccessary to * update the "next" pointer in the previous element * to NULL. (This will reflect that the previous * element is now the new "end" of the list. ) */ assert( b != NULL ); while((b->next != NULL )) { /* printf( "Traversing %p, prev=%p\n", b, prev ); */ prev = b; b = b->next; } debug( "Found block %p with %d bytes left\n", b, b->size ); /* Now, we have located the block_q element at the end * of the list and stored it in 'b'. The next data * buffer to be written is b->block, and there are * b->size remaining bytes. Call the write(2) system * call. After we call write(2), there are three basic * possibilities. Either there is some sort of error * (n<0), there is an EOF condition (n == 0) or a * buffer of data has been read in (n>0). Test for * each case */ if((n = write(outfd, b->block, b->size )) < 0 ) { /* In this case N<0 and there is some sort of * error. If the error is EWOULDBLOCK, we * actually don't care. But we should never get * EWOULDBLOCK since we would not have called * write(2) if select(2) had not indicated to us * that the fd was available. */ if( errno != EWOULDBLOCK) exit_error( "write error on socket" ); debug( "Wierd -- should not be here\n" ); } else if ( n == 0 ) { /* This is the case where there is a pending EOF. * According to Stevens, there is only one situation * where select(2) will indicate a socket is * available for writing but that write(2) returns a * zero value of bytes written. In this case, the * remote host has indicated an EOF condition. * Therefore, if we reach this block of code the * kernel has indicated to use that the writing end * has shutdown the connection from its end. The * remote host does not care for any further data * from us. Our work is now done -- we set a flag * in the netpipe struct netpipe[i].wclose * indicating pending close, and in the interm we * make sure that this netpipe is marked as dormant * in the select(2) I/O vector. */ debug( "EOF on socket, must have closed\n" ); /* Shutdown our end of the connection */ if( shutdown( outfd, SHUT_WR ) < 0 ) /* writing disallowed, sends FIN */ exit_error( "shutdown outfd EOF" ); /* Make infd and outfd uninteresting to select */ FD_CLR( infd, &rset ); FD_CLR( outfd, &wset ); /* Mark this netpipe as wclose'ed */ netpipe[i].wclose = 1; } else { /* This block is passed control if the write(2) * operation was sucessful. We must handle cases * based on how many bytes were actually written, * which could be less than the block presently * stored in block_q. */ debug( "wrote %d bytes to socket %d\n", n, outfd ); /* Optionally display to the user what was in the * dequeued buffer */ #if SHOWDATA debug( "DEQUEUED from pipe %d:", i ); print_buffer( b->block, n ); #endif /* Now check if the buffer has been completely * emptied by comparing the known size of the * buffer to n to the number of bytes written. */ if( n < b->size ) { /* If it has not been completely emptied, we * have to increment b->size and b->block so * that the next time this block_q structure * * is read from, we will pick up where we left * off. */ /* fprintf( stderr, "Short write %d < %d!\n", n, b->size ); */ b->size -= n; b->block += n; } else { /* In this case, the block b has been * completely written out and we can purge * everything inside it. */ b->block = NULL; /* Paranoia */ /* Check to see if we have to update the * previous node. If there is no previous node * in the linked list, then we don't have to * update anything. */ if( prev ) { /* There was a previous entry. Make sure * that this previous entry points to b * (if it doesn't we have a memory leak */ assert( prev->next == b ); /* Indicate in the previous structure that * there is no longer a subsequent * entry. */ prev->next = NULL; } else { /* In this case, there was no previous * entry, so we have just cleared out the * last of the blocks in the FIFO. We do a * number of additional stuff now. */ /* The fifo is now empty */ netpipe[i].fifo = NULL; /* No more stuff to write, so clear outfd * from the list of descriptors for * select(2) */ FD_CLR( outfd, &wset); /* Since we have just flushed the last * block of the fifo, we can close the * connection if there is a pending EOF on * the reading end. Check if there is one, * and send FIN along to outfd. */ if( netpipe[i].rclose ) { debug( "Reading end closed, buffer clear.\n" ); /* Send FIN along to outfd */ if( shutdown( outfd, SHUT_WR ) < 0 ) exit_error( "shutdown outfd EOF" ); /* Take this client out of the select loop */ FD_CLR( infd, &rset ); FD_CLR( outfd, &wset ); /* Mark the pipe wclose -- now the * reaping code can clean up the this * netpipe_t. */ netpipe[i].wclose = 1; } } /* Now we have handles the cases of things * that depend on b, we can get rid of b. We * must free 'b->handle' first, then purge b * in the usual way */ debug( "Freeing %p, %p\n", b, b->handle ); assert(b->handle); /* Paranoia */ assert(b); /* Paranoia */ /* Free the structs and clear them */ free( b->handle ); b->handle = NULL; free( b ); b = NULL; /* Thank god its gone */ } /* End check for last block */ } /* End write if */ } /* If netpipe[i].outfd is writable */ /* The following section of code checks to see if * netpipe[i] is ready to terminate by checking the rclose * and wclose flags. If the netpipe is ready to be * decallocated, we make sure that its descriptors have * been removed out of the rset and wset vectors and we * mark it unused. Additionally, we check if the * terminating netpipe is the last of the pair to leave, * and we call close(2) on the two sockets and decrease * numclients. */ if( netpipe[i].rclose && netpipe[i].wclose ) { debug( "Cleaning up netpipe %d\n", i ); /* Check to see if we are the last of the pair to * leave by seeing if our netpipe[pair_id].used is * unset. */ if(netpipe[netpipe[i].pair_id].used == 0 ) { debug( "Performing final cleanup of %d and %d\n", infd, outfd ); /* We are the last to be using infd and outfd, so * close them out. */ if( close(infd) < 0 ) exit_error( "close infd" ); if( close(outfd) < 0 ) exit_error( "close outfd" ); /* Remove infd and outfd entirely from the vectors * to prevent any descriptors assocated with * either direction of this netpipe pair from * waking the select loop. */ FD_CLR( infd, &wset); FD_CLR( outfd, &rset); /* Finally, we can decrement numclients. This is the only time numclients is ever lowered in the code. */ numclients--; fprintf( stderr, "Client finished\n" ); /* Check to see if we are under the maximum number * of clients now that this client connection has * ended. If so, we can restore 'ls' to rset and * begin accepting things once again. */ if( numclients < MAXCLIENTS ) FD_SET( ls, &rset ); } /* If there are any block_q structures still active, * free(2) them. */ if( netpipe[i].fifo ) { block_q *b = netpipe[i].fifo; debug( "Cleaning up structures in pipe %i\n", i ); /* Loop through the entire linked list freeing everything */ while((b->next != NULL )) { block_q *t = b; b = b->next; free(t->handle); free(t); } } /* Make sure that read and write events no longer wake * us up */ FD_CLR( infd, &rset); FD_CLR( outfd, &wset); /* Mark the netpipe slot as usable again. */ netpipe[i].used = 0; } } } }