//=============================================================================
//
//   File : kvi_sp_ctcp.cpp
//   Creation date : Thu Aug 16 2000 13:34:42 by Szymon Stefanek
//
//   This file is part of the KVirc irc client distribution
//   Copyright (C) 1999-2000 Szymon Stefanek (pragma at kvirc dot net)
//
//   This program is FREE software. You can redistribute it and/or
//   modify it under the terms of the GNU General Public License
//   as published by the Free Software Foundation; either version 2
//   of the License, or (at your option) any later version.
//
//   This program is distributed in the HOPE that it will be USEFUL,
//   but WITHOUT ANY WARRANTY; without even the implied warranty of
//   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
//   See the GNU General Public License for more details.
//
//   You should have received a copy of the GNU General Public License
//   along with this program. If not, write to the Free Software Foundation,
//   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
//
//=============================================================================

#define __KVIRC__

// FIXME: #warning "CTCP BEEP == WAKEUP == AWAKE"
// FIXME: #warning "CTCP AVATARREQ or QUERYAVATAR"

#include "kvi_mirccntrl.h"
#include "kvi_osinfo.h"
#include "kvi_app.h"
#include "kvi_sparser.h"
#include "kvi_window.h"
#include "kvi_out.h"
#include "kvi_locale.h"
#include "kvi_ircsocket.h"
#include "kvi_channel.h"
#include "kvi_defaults.h"
#include "kvi_channel.h"
#include "kvi_query.h"
#include "kvi_ircuserdb.h"
#include "kvi_iconmanager.h"
#include "kvi_modulemanager.h"
#include "kvi_sharedfiles.h"
#include "kvi_time.h"
#include "kvi_fileutils.h"
#include "kvi_ctcppagedialog.h"
#include "kvi_useraction.h"
#include "kvi_options.h"
#include "kvi_ircconnection.h"
#include "kvi_ircconnectionuserinfo.h"
#include "kvi_ircconnectionantictcpflooddata.h"
#include "kvi_lagmeter.h"
#include "kvi_kvs_eventtriggers.h"
#include "kvi_kvs_script.h"
#include "kvi_sourcesdate.h"
#include "kvi_regusersdb.h"

#include
#include

#ifdef COMPILE_USE_QT4
	#include
#else
	#include
#endif

extern KVIRC_API KviSharedFilesManager * g_pSharedFilesManager;
extern KVIRC_API KviCtcpPageDialog * g_pCtcpPageDialog;

/*
	@doc: ctcp_handling
	@title:
		KVIrc and CTCP
	@short:
		For developers: Client-To-Client Protocol handling in KVIrc
	@body:
		[big]Introduction[/big][br]
		Personally, I think that the CTCP specification is to be symbolically printed & burned.
		It is really too complex (you can go mad with the quoting specifications)
		and NO IRC CLIENT supports it completely.
		Here is my personal point of view on the CTCP protocol.[br]
		[big]What is CTCP?[/big][br]
		CTCP stands for Client-to-Client Protocol.
		It is designed for exchanging almost arbitrary data between IRC clients;
		the data is embedded into text messages of the underlying IRC protocol.[br]
		[big]Basic concepts[/big][br]
		A CTCP message is sent as part of the PRIVMSG and NOTICE IRC commands.[br]
		To differentiate the CTCP message from normal IRC message text we use a delimiter character (ASCII char 1);
		we will use the symbol <0x01> for this delimiter.
		You may receive a CTCP message from the server in one of the following two ways:[br]
		[b]:<source_mask> PRIVMSG <target> :<0x01><ctcp_message><0x01>[/b][br]
		[b]:<source_mask> NOTICE <target> :<0x01><ctcp_message><0x01>[/b][br]
		The PRIVMSG is used for CTCP REQUESTS, the NOTICE for CTCP REPLIES.
		The NOTICE form should never generate an automatic reply.[br]
		The two delimiters were meant to begin and terminate the CTCP message;
		the original protocol allowed more than one CTCP message inside a single IRC message.
		[b]Nobody sends more than one message at once, no client can recognize it
		(since it complicates the message parsing), and it could even be dangerous (see below)[/b].
		It makes no real sense unless we wanted to use the CTCP protocol to embed escape sequences
		into IRC messages, which is not the case.[br]
		Furthermore, sending several CTCP messages in a single IRC message could easily be used to flood a client.
		Assuming 450 characters available for the IRC message text part, you could include
		50 CTCP messages containing "<0x01>VERSION<0x01>" (9 characters each).[br]
		Since the VERSION replies are usually long (only 3 or 4 replies fit in a single IRC message),
		a client that has no CTCP flood protection (or has it disabled) will surely be disconnected
		while sending the replies, after receiving only a single IRC message (no flood for the sender).
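		To make the flood arithmetic above concrete, here is a minimal C sketch. It is not KVIrc code:
		count_ctcp_chunks and build_version_flood are hypothetical helper names used only for illustration.
		It builds the 450 character text part and counts the <0x01> delimited requests that a client
		accepting several CTCP messages per IRC message would try to answer.[br]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: counts the <0x01>-delimited chunks found in the
// trailing part of an IRC message. The closing delimiter is optional.
size_t count_ctcp_chunks(const char * trailing)
{
	size_t count = 0;
	assert(trailing);
	while(*trailing)
	{
		if(*trailing == '\001')
		{
			// opening delimiter: skip up to the closing one (or the end)
			trailing++;
			while(*trailing && (*trailing != '\001'))trailing++;
			if(*trailing)trailing++; // step over the closing delimiter
			count++;
		} else trailing++;
	}
	return count;
}

// Hypothetical helper: fills buf (size bytes) with as many
// "<0x01>VERSION<0x01>" chunks as fit in max_text characters.
// Returns the number of characters written (terminator excluded).
size_t build_version_flood(char * buf,size_t size,size_t max_text)
{
	static const char chunk[] = "\001VERSION\001";
	const size_t chunk_len = sizeof(chunk) - 1; // 9 characters
	size_t used = 0;
	assert(buf && (size > 0));
	while((used + chunk_len <= max_text) && (used + chunk_len < size))
	{
		memcpy(buf + used,chunk,chunk_len);
		used += chunk_len;
	}
	buf[used] = '\0';
	return used;
}
```

		With a 512 byte buffer and max_text set to 450, build_version_flood() uses all 450 characters
		and count_ctcp_chunks() reports 50 requests, which matches the figure given above.[br]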
		From my personal point of view, only [b]one CTCP message per IRC message[/b] should be allowed,
		and theoretically the trailing <0x01> delimiter can be optional.[br]
		[big]How to extract the CTCP message[/big][br]
		The IRC messages do not allow the following characters to be sent:[br]
		<NUL> (ASCII character 0), <CR> (Carriage return), <LF> (Line feed).[br]
		So finally we have four characters that [b]cannot appear literally in a CTCP message[/b]:
		<NUL>,<CR>,<LF>,<0x01>.[br]
		To extract a <ctcp_message> from an IRC PRIVMSG or NOTICE command you have to perform the following actions:[br]
		Find the <trailing> part of the IRC message (the one just after the ':' delimiter, or the last message token).[br]
		Check if the first character of the <trailing> is a <0x01>; if it is, we have a <ctcp_message>
		beginning just after <0x01>.
		The trailing (optional) <0x01> can be removed in this phase or later,
		assuming that it is not a valid char in the <ctcp_message>.[br]
		In this document I will assume that you have stripped the trailing <0x01>
		and thus from now on we will deal only with the <ctcp_message> part.[br]
		[big]Parsing a CTCP message: The quoting dilemma[/big][br]
		Since there are characters that cannot appear in a <ctcp_message>, theoretically we should use a quoting mechanism.
		Well, in fact, no actual CTCP message uses the quoting: there is no need to include
		a <NUL>, a <CR> or an <LF> inside the actually defined messages
		(the only one could be CTCP SED, but I have never seen it in action... is there any client that implements it?).
		We could also leave the "quoting" to the "single message type semantic":
		a message that needs to include "any character" could have its own encoding method (Base64 for example).
		With the "one CTCP per IRC message" convention we could even allow <0x01> inside messages.
		Only the leading (and possibly trailing) <0x01> would be the delimiter, the other ones would be valid characters.
		Finally, is there any CTCP type that needs <0x01> inside a message?
		<0x01> is not printable (as well as <CR>, <LF> and <NUL>), so only encoded messages
		(and again we can stick to the single message semantic) or the ones including special parameters would need it.
		Some machines might allow <0x01> in filenames... well, a file with <0x01> in its name has something
		broken inside, or the creator is a sort of "hacker" (so he also knows how to rename a file...) :).[br]
		Anyway, let's be pedantic, and define this quoting method.
		Let's use the most intuitive method, adopted all around the world:[br]
		The backslash character ('\') as escape.[br]
		An escape sequence is formed by the backslash character and a number of following ASCII characters.
		We define the following two types of escape sequences:[br]
		[b]'\XXX'[/b] (where XXX is an [b]octal number[/b] formed by three digits)
		that indicates the ASCII character with the code that corresponds to the number.[br]
		[b]'\C'[/b] (where C is a [b]CTCP valid ASCII non-digit character[/b]) that corresponds
		literally to the character C, discarding any other semantic that might be associated
		with it (this will become clear later).
		I've chosen the octal representation just to follow the old specification a bit:
		the authors seemed to like it. This point could be discussed on some mailing list or something.
		The '\C' sequence is useful to include the backslash character (escape sequence '\\').[br]
		[big]Let's mess a little more[/big][br]
		A CTCP message is made of [b]space separated parameters[/b].[br]
		The natural way of separating parameters is to use the space character.
		We define a "token" as a sequence of valid CTCP characters not including literal space.
		A <ctcp_parameter> is usually a token, but not always;
		filenames can contain spaces inside names (and it happens very often!).
		So one of the parameters of CTCP DCC is not a space separated token.
		How do we handle it? Again a standard is missing.
		Some clients simply change the filename placing underscores instead of spaces;
		this is a reasonable solution if used with care.
		Other clients attempt to "isolate" the filename token by surrounding it with some kind
		of quotes, usually the '"' or ''' characters. This is also a good solution.
		Another one that naturally comes to my mind is to use the previously defined
		quoting to define a "non-breaking space" character, because a space after a backslash
		could lose its original semantic. Better yet, use the backslash followed by
		the octal representation of the space character ('\040').
		Anyway, to maintain compatibility with other popular IRC clients (such as mIRC),
		let's include the '"' quotes in our standard: literal (unescaped) '"' quotes
		define a single token string. To include a literal '"' character, escape it.
		Additionally, the last parameter of a <ctcp_message> may be made of multiple tokens.
		[big]A CTCP parameter extracting example[/big][br]
		A trivial example of a C "CTCP parameter extracting routine" follows.[br]
		An IRC message is made of up to 510 usable characters.
		When a CTCP is sent there is a PRIVMSG or NOTICE token that uses at least 6 characters,
		at least two spaces and a target token (that cannot be empty, so it is at least one character)
		and finally one <0x01> escape character. This gives 500 characters
		as maximum size for a complete <ctcp_message> and thus for a <ctcp_parameter>.
		In fact, the <ctcp_message> is always smaller than 500 characters; there are usually two
		<0x01> chars, there is a message source part at the beginning of the IRC message
		that is 10-15 characters long, and there is a ':' character before the trailing parameter.
		Anyway, to really be on the "safe side", we use a 512 character buffer for each
		<ctcp_parameter>. Finally, I'll assume that you have already ensured that
		the <ctcp_message> that we are extracting from is shorter than 511 characters in all,
		and have provided a buffer big enough to avoid this code segfaulting.
		I'm assuming that msg_ptr points somewhere in the <ctcp_message> and is null-terminated.[br]
		(There are C++ style comments, you might want to remove them)
		[example]
		const char * decode_escape(const char * msg_ptr,char * buffer)
		{
			// This one decodes an escape sequence
			// and returns the pointer "just after it"
			// and should be called when *msg_ptr points
			// just after a backslash.
			// Exactly one character is stored in *buffer.
			unsigned char c;
			if((*msg_ptr >= '0') && (*msg_ptr < '8'))
			{
				// a digit follows the backslash
				c = *msg_ptr - '0';
				msg_ptr++;
				if((*msg_ptr >= '0') && (*msg_ptr < '8'))
				{
					c = ((c << 3) + (*msg_ptr - '0'));
					msg_ptr++;
					if((*msg_ptr >= '0') && (*msg_ptr < '8'))
					{
						c = ((c << 3) + (*msg_ptr - '0'));
						msg_ptr++;
					} // else broken message, but let's be flexible
				} // else it is broken, but let's be flexible
				// append the character and return
				*buffer = (char)c;
				return msg_ptr;
			} else {
				// simple escape: just append the following
				// character (thus discarding its semantic)
				*buffer = *msg_ptr;
				return ++msg_ptr;
			}
		}

		const char * extract_ctcp_parameter(const char * msg_ptr,char * buffer,int spaceBreaks)
		{
			// this one extracts the "next" ctcp parameter in msg_ptr
			// it skips the leading and trailing spaces.
			// spaceBreaks should be set to 0 if (and only if) the
			// extracted parameter is the last in the CTCP message.
			// The extracted parameter is null-terminated in buffer.
			int inString = 0;
			while(*msg_ptr == ' ')msg_ptr++;
			while(*msg_ptr)
			{
				switch(*msg_ptr)
				{
					case '\\':
						// backslash : escape sequence
						msg_ptr++;
						if(*msg_ptr)
						{
							msg_ptr = decode_escape(msg_ptr,buffer);
							buffer++; // decode_escape stored one character
						}
						// else senseless trailing backslash: the loop ends here
					break;
					case ' ':
						// space : separate tokens?
						if(inString || (!spaceBreaks))*buffer++ = *msg_ptr++;
						else {
							// not in string and space breaks: end of token
							// skip trailing white space (this could be avoided)
							// and return
							while(*msg_ptr == ' ')msg_ptr++;
							*buffer = '\0';
							return msg_ptr;
						}
					break;
					case '"':
						// a string begin or end
						inString = !inString;
						msg_ptr++;
					break;
					default:
						// any other char
						*buffer++ = *msg_ptr++;
					break;
				}
			}
			*buffer = '\0';
			return msg_ptr;
		}
		[/example][br]
		[big]CTCP parameter semantics[/big][br]
		The first <ctcp_parameter> of a <ctcp_message> is the <ctcp_tag>: it defines
		the semantic of the rest of the message.[br]
		Although it is a convention to specify the <ctcp_tag> as uppercase letters,
		and the original specification says that the whole <ctcp_message> is
		case sensitive, I'd prefer to follow the IRC message semantic (just to
		have fewer "special cases") and treat the whole message as [b]case insensitive[/b].[br]
		The remaining tokens depend on the <ctcp_tag>. A description of known <ctcp_tags>
		and thus <ctcp_messages> follows.[br]
		[big]PING[/big][br]
		[b]Syntax: <0x01>PING <data><0x01>[/b][br]
		The PING request is used to check the round trip time from one client to another.
		The receiving client should reply with exactly the same message, but sent
		through a NOTICE instead of a PRIVMSG. The <data> usually contains an unsigned
		integer but not necessarily; it is not even mandatory for <data> to be a single token.
		The receiver should ignore the semantic of <data>.[br]
		The reply is intended to be processed by IRC clients.
		[big]VERSION[/big][br]
		[b]Syntax: <0x01>VERSION<0x01>[/b][br]
		The VERSION request asks for information about another user's IRC client program.
		The reply should be sent through a NOTICE with the following syntax:[br]
		<0x01>VERSION <client_version_data><0x01>[br]
		The preferred form for <client_version_data> is
		"<client_name>:<client_version>:<client_environment>", but historically clients (and users) send
		a generic reply describing the client name, version and possibly the used script name.
		This CTCP reply is intended to be human readable, so any form is accepted.
		[big]USERINFO[/big][br]
		[b]Syntax: <0x01>USERINFO<0x01>[/b][br]
		The USERINFO request asks for information about another user.
		The reply should be sent through a NOTICE with the following syntax:[br]
		<0x01>USERINFO <user_info_data><0x01>[br]
		The <user_info_data> should be a human readable "user defined" string.
		[big]CLIENTINFO[/big][br]
		[b]Syntax: <0x01>CLIENTINFO<0x01>[/b][br]
		The CLIENTINFO request asks for information about another user's IRC client program.
		While VERSION requests the client program name and version, CLIENTINFO requests
		information about CTCP capabilities.[br]
		The reply should be sent through a NOTICE with the following syntax:[br]
		<0x01>CLIENTINFO <client_info_data><0x01>[br]
		The <client_info_data> should contain a list of supported CTCP request tags.
		The CLIENTINFO reply is intended to be human readable.
		[big]FINGER[/big][br]
		[b]Syntax: <0x01>FINGER<0x01>[/b][br]
		The FINGER request asks for information about another IRC user.
		The reply should be sent through a NOTICE with the following syntax:[br]
		<0x01>FINGER <user_info_data><0x01>[br]
		The <user_info_data> should be a human readable string containing
		the system username and possibly the system idle time.
		[big]SOURCE[/big][br]
		[b]Syntax: <0x01>SOURCE<0x01>[/b][br]
		The SOURCE request asks for the client homepage or ftp site information.
		The reply should be sent through a NOTICE with the following syntax:[br]
		<0x01>SOURCE <source_data><0x01>[br]
		This CTCP reply is intended to be human readable, so any form is accepted.
		[big]TIME[/big][br]
		[b]Syntax: <0x01>TIME<0x01>[/b][br]
		The TIME request asks for the user local time.
		The reply should be sent through a NOTICE with the following syntax:[br]
		<0x01>TIME