/program/lib/waslib.php - core functions
This file provides various utility routines.
The constants CAPACITY_* are used for group memberships (see accountmanagerlib.php).
construct a link to appropriate legal notices as per AGPLv3 section 5
This routine constructs ready-to-use HTML-code for a link to the Appropriate Legal Notices, which are to be found in /program/about.html. Depending on the highvisibility flag we either generate a text-based link or a clickable image.
The actual text / image to use depends on the global constant WAS_ORIGINAL. This constant is defined in /program/version.php and it should be TRUE for the original version of Website@School and FALSE for modified versions.
In the former case the anchor looks like 'Powered by Website@School', in the latter case it will look like 'Based on Website@School', which is in line with the requirements from the license agreement for Website@School, see /program/license.html.
IMPORTANT NOTE
Please respect the license agreement and change the definition of WAS_ORIGINAL to FALSE if you modify this program (see /program/version.php). You also should change the file '/program/about.html' and add a 'prominent notice' of your modifications.
Note: a comparable routine can be found in install.php.
construct a tree of nodes in memory
this reads the nodes in the specified area from disk and constructs a tree via linked lists (sort of). If parameter $force is TRUE, the data is read from the database, otherwise a cached version is returned (if available).
Note that this routine also 'repairs' the tree when an orphan is detected. The orphan is automagically moved to the top of the area. Of course, it shouldn't happen, but if it does we are better off with a magically _appearing_ orphan than a _disappearing_ node.
A lot of operations in the page manager work with a tree of nodes in some way, e.g. walking the tree and displaying it or walking the tree and collecting the sections (but not the pages), etc.
The tree starts with a 'root' with key 0 ($tree[0]). This is the starting point of the tree. The nodes at the top level of an area are linked from this root node via the field 'first_child_id'. If there are no nodes in the area, this field 'first_child_id' is zero. The linked list is constructed by using the node_id. All nodes in an area are collected in an array. This array is used to construct the linked lists.
Every node has a parent (via 'parent_id'), where the nodes at the top level have a parent_id of zero; this points to the 'root'. The nodes within a section or at the top level are linked forward via 'next_sibling_id' and backward via 'prev_sibling_id'. A zero indicates the end of the list. Children start with 'first_child_id'. A value of zero means: no children.
The complete node record from the database is also stored in the tree. This is used extensively throughout the pagemanager; it acts as a cache for all nodes in an area.
Note that we cache the node records per area. If two areas are involved, the cache doesn't work very well anymore. However, this doesn't happen very often; only in case of moving nodes from one area to another (and even then).
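The linking scheme described above can be illustrated with a minimal sketch. It is not the routine from waslib.php: the function name sketch_build_tree() is hypothetical, the input is assumed to be a plain array of node records (already in sort order) instead of a database result, and the caching via $force is left out. Treating a node that names itself as parent as an orphan is also an assumption.

```php
<?php
// Hypothetical sketch: link flat node records into a tree keyed by node_id.
// $records stands in for the database rows of one area, already in sort order.
function sketch_build_tree(array $records) {
    $empty = array(
        'parent_id'       => 0,
        'prev_sibling_id' => 0,
        'next_sibling_id' => 0,
        'first_child_id'  => 0,
        'record'          => null,
    );
    $tree = array(0 => $empty);              // $tree[0] is the 'root'

    // Pass 1: collect all nodes of the area in one array.
    foreach ($records as $record) {
        $node = $empty;
        $node['parent_id'] = (int) $record['parent_id'];
        $node['record'] = $record;           // the full record doubles as a cache
        $tree[(int) $record['node_id']] = $node;
    }

    // Pass 2: construct the linked lists.
    $last_child = array();                   // parent_id => last child linked so far
    foreach ($records as $record) {
        $node_id = (int) $record['node_id'];
        $parent_id = $tree[$node_id]['parent_id'];
        if (!isset($tree[$parent_id]) || $parent_id === $node_id) {
            // Orphan (missing parent; self-reference is an assumption here):
            // make it appear at the top level instead of disappearing.
            $parent_id = 0;
            $tree[$node_id]['parent_id'] = 0;
        }
        if (isset($last_child[$parent_id])) {
            $prev_id = $last_child[$parent_id];
            $tree[$prev_id]['next_sibling_id'] = $node_id;
            $tree[$node_id]['prev_sibling_id'] = $prev_id;
        } else {
            $tree[$parent_id]['first_child_id'] = $node_id;
        }
        $last_child[$parent_id] = $node_id;
    }
    return $tree;
}
```

Walking the top level then comes down to starting at $tree[0]['first_child_id'] and following 'next_sibling_id' until a zero is found.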
try to eliminate the scheme and authority from the two main uri's
This tries to get rid of the scheme and the authority in 'www' and 'progwww'. If these two elements are the same, it becomes possible to use a shorter form of the uri when referencing files in 'progwww' from 'www'.
If the scheme and the authority of 'www' and 'progwww' are the same, the returned strings contain only the path elements. If scheme and authority differ, they contain the same as 'www' and 'progwww' respectively.
Examples: www = 'http://www.example.com/site' and progwww = 'http://www.example.com/site/program' yields www_short = '' and progwww_short = '/program'.
www = 'http://www.example.com' and progwww = 'http://common.example.com/program' yields www_short identical to www and progwww_short identical to progwww.
The purpose is to be able to generate relative links, e.g. an image in /program/graphics/foo.jpg can be referred to like this
Note that the comparison in this routine is not very fancy; it can easily be fooled into considering scheme+authority to be different. However, since this routine is only used to compare two values from config.php, it's not likely to cause trouble.
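As an illustration of the idea, here is a minimal sketch. The function names are hypothetical and the comparison is done with PHP's parse_url(), which is an assumption; the actual routine may compare the strings differently. User information in the authority is ignored for simplicity.

```php
<?php
// Hypothetical helper: scheme + host (+ port) of a uri, or null if incomplete.
function sketch_authority_key($uri) {
    $parts = parse_url($uri);
    if (!is_array($parts) || !isset($parts['scheme'], $parts['host'])) {
        return null;
    }
    $port = isset($parts['port']) ? ':' . $parts['port'] : '';
    return strtolower($parts['scheme'] . '://' . $parts['host']) . $port;
}

// Hypothetical sketch: returns array($www_short, $progwww_short).
function sketch_uri_shortcuts($www, $progwww) {
    $key = sketch_authority_key($www);
    if ($key === null || $key !== sketch_authority_key($progwww)) {
        // Scheme or authority differ (or cannot be determined): no shortcut.
        return array($www, $progwww);
    }
    // Same scheme and authority: only the path elements remain.
    return array(
        (string) parse_url($www, PHP_URL_PATH),
        (string) parse_url($progwww, PHP_URL_PATH),
    );
}
```

With www = 'http://www.example.com' and progwww = 'http://www.example.com/program' this sketch returns '' and '/program'.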
calculate an array with acls related to user $user_id via group memberships
this calculates the related acls for user $user_id. The results are returned as an array keyed by acl_id. It can contain 0 or more elements. The values of the array elements are groupname/capacity-pairs. This routine is referenced from both useraccount.class.php and usermanager.class.php.
translate a numeric capacity code to a readable name
this translates a capacity code into a readable name, e.g. as an item in a dropdown list when dealing with group memberships. The actual codes are defined as constants, e.g. CAPACITY_NONE.
convert a string to another type (bool, int, etc.)
send pending messages/alerts
this goes through all the alert accounts to see if any messages need to be sent out by email. The strategy is as follows. First we collect a maximum of $max_messages alerts in core (1 trip to the database). Then we iterate through that collection and for every alert we
Assuming that an UPDATE is more or less atomic, we hopefully can get away with an UPDATE with a where clause looking explicitly for the previous value of the message count. If a message was added after retrieving the alerts but before updating, the message count would be incremented (by the other process) which would prevent us from updating. The alert would be left unchanged but including the added message. Worst case: the receiver gets the same list of alerts again and again. I consider that a fair trade off, given the low probability of it happening. (Mmmm, famous last words...)
Bottom line, we don't do locking in this routine.
Note that we add a small reminder to the message buffer about us processing the alert and sending a message. However, we don't set the number of messages to 1 because otherwise that would be the signal to send this message the next time. We don't want to send a message every $cron_interval minutes basically saying that we didn't do anything since the previous run. (Or is this a feature after all?)
Failures are logged; successes are logged with priority LOG_DEBUG.
retrieve a list of all available area records keyed by area_id
this returns a list of area-records or FALSE if no areas are available. The list is cached via a static variable so we don't have to go to the database more than once for this. Note that the returned array is keyed with area_id and is sorted by sort_order. Also note that this list may include areas for which the current user has no permissions whatsoever.
return an integer value specified in the page request or default value if none
return an (unquoted) string value specified in the page request or default value if none
retrieve typed properties (name-value-pairs) from a table
this retrieves the fields 'name', 'value' and 'type' from all records from $tablename that satisfy the condition in $where. The values, which are stored as strings in the database, are converted to their proper value type and stored in the resulting array, keyed by name. The following types are recognised:
get the number of the area the user requested or null if not specified
See discussion of get_requested_node(). We use a separate routine because we may want to support index.php/aaa/nnn/.... instead of index.php?area=aaa&node=nnn&...
get the name of the requested file
See discussion of get_requested_node(). Files are served via /file.php via a comparable mechanism: either
http://localhost/file.php/path/to/filename.ext
OR
http://localhost/file.php?file=/path/to/filename.ext
This routine extracts the '/path/to/filename.ext' part.
get the number of the node the user requested or NULL if not specified
This routine exists because nodes and areas are so central to the whole idea of WAS.
The purpose is to retrieve any requested node_id from the parameters submitted by the user. As a rule this works via name-value-pairs, something like this: index.php?area=aaa&node=nnn. However, if the webserver is configured correctly, we can also accept index.php/aaa/nnn/.... or index.php/nnn/.... which is more proxy-friendly. Using a generic routine like get_parameter_int() would not be sufficient in that case, so there.
Note that the same proxy-friendly 'trick' is used to determine the filename of a file that needs to be served via file.php (see get_requested_filename()).
Note that we first look at the proxy-friendly variant. If that doesn't work, we resort to the conventional way of index.php?node=nnn. Also note that the order of the path_info is important. If there is just a single numeric path component, we assume that it is the node value; if there are two numerics our assumption is that the first one is the area id and the second one the node id.
Note that this routine does not validate the requested node in any way other than making sure that IF it is specified, it is an integer value. For all we know it might even be a negative value.
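The order of evaluation described above can be sketched as follows. This is an illustration only: the function name and the convention of returning array($area_id, $node_id) are hypothetical, and the path info and the GET parameters are passed in as arguments instead of being read from the request. Only the leading numeric path components are considered, which is an assumption.

```php
<?php
// Hypothetical sketch: returns array($area_id, $node_id); either may be null.
function sketch_requested_ids($path_info, array $get) {
    // 1. Proxy-friendly variant: index.php/aaa/nnn/... or index.php/nnn/...
    $numbers = array();
    foreach (explode('/', trim((string) $path_info, '/')) as $component) {
        if (!preg_match('/^-?[0-9]+$/', $component) || count($numbers) >= 2) {
            break;                            // stop at the first non-numeric part
        }
        $numbers[] = (int) $component;
    }
    if (count($numbers) === 2) {
        return array($numbers[0], $numbers[1]);   // area id first, node id second
    }
    if (count($numbers) === 1) {
        return array(null, $numbers[0]);          // a single number is the node id
    }
    // 2. Conventional variant: index.php?area=aaa&node=nnn
    $area_id = isset($get['area']) ? (int) $get['area'] : null;
    $node_id = isset($get['node']) ? (int) $get['node'] : null;
    return array($area_id, $node_id);
}
```

As in the description, nothing is validated beyond forcing the values to integers; a negative node id passes through unchanged.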
a small utility routine that returns a unique integer
this generates a unique number (starting at 1). This number is guaranteed to be unique during this http-request (or at least until the static variable $id overflows, but that takes a while). If the optional parameter $increment is FALSE, the latest id returned is returned again.
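A counter like this is typically built on a static variable. The sketch below shows the idea; the function name is hypothetical and the body is an assumption based on the description above.

```php
<?php
// Hypothetical sketch of a per-request unique number generator.
function sketch_unique_number($increment = true) {
    static $id = 0;          // survives between calls within one http-request
    if ($increment) {
        ++$id;
    }
    return $id;              // with $increment FALSE: the latest id, once again
}
```

A typical use would be generating unique id-attributes for HTML elements within a single page.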
retrieve the records of the groups of which user $user_id is a member
return an integer (bytecount) value from PHP ini
determine if any of the ancestors or $node_id itself is already expired
This climbs the tree upward, starting at $node_id, to see if any nodes are expired. If an expired node is detected, TRUE is returned. If none of the nodes are expired, then FALSE is returned.
Note that this routine looks strictly at the expiry property; it is very well possible that a node is under embargo, see is_under_embargo().
Also note that this routine currently also tries to 'fix' the node database when a circular reference is detected. This doesn't really belong here, but for the time being it is convenient to have this auto-repair mechanism here. The node that is fixed is the section we are looking at after MAXIMUM_ITERATIONS tries, which is not necessarily the node we started with.
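The upward climb with an iteration limit can be sketched like this. Several things are assumptions: the function and constant names are hypothetical, the value 50 for the iteration limit is made up, expiry is represented as a Unix timestamp, and the tree is a simple array keyed by node id. The auto-repair of circular references is deliberately left out; this sketch merely gives up after the maximum number of iterations. The embargo check follows the same pattern with a different property.

```php
<?php
// Hypothetical limit; the real value of MAXIMUM_ITERATIONS is not shown here.
define('SKETCH_MAXIMUM_ITERATIONS', 50);

// Hypothetical sketch: TRUE if $node_id or any of its ancestors is expired.
function sketch_is_expired($node_id, array $tree, $now) {
    for ($i = 0; $i < SKETCH_MAXIMUM_ITERATIONS; ++$i) {
        if ($node_id == 0 || !isset($tree[$node_id])) {
            return false;                    // reached the root: nothing expired
        }
        if ($tree[$node_id]['expiry'] <= $now) {
            return true;                     // this node (or ancestor) is expired
        }
        $node_id = $tree[$node_id]['parent_id'];   // climb one level
    }
    // Still climbing after the limit: probably a circular reference.
    // The real routine attempts a repair at this point; this sketch does not.
    return false;
}
```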
determine if any of the ancestors or $node_id itself is under embargo
This climbs the tree upward, starting at $node_id, to see if any nodes are under embargo. If an embargo'ed node is detected, TRUE is returned. If none of the nodes are under embargo, then FALSE is returned.
Note that this routine looks strictly at the embargo property; it is very well possible that a node is expired, see is_expired().
Also note that this routine currently also tries to 'fix' the node database when a circular reference is detected. This doesn't really belong here, but for the time being it is convenient to have this auto-repair mechanism here. The node that is fixed is the section we are looking at after MAXIMUM_ITERATIONS tries, which is not necessarily the node we started with.
massage a message and generate a javascript alert()
put a (co-operative) lock on a record
this tries to set the (co-operative) lock on the record with serial (pkey) $id in table $tablename by setting the $locked_by field to our own session_id. This is the companion routine of lock_release().
The mechanism of co-operative locking works as follows. Some tables (such as the 'nodes' table) have an int field, e.g. 'locked_by_session_id'. This field can either be NULL (indicating that the record is not locked) or hold the primary key of a session (indicating that the record is locked and also by which session).
Obtaining a lock boils down to updating the table and setting that field to the session_id. As long as the underlying database system guarantees that execution of an UPDATE statement is not interrupted, we can use UPDATE as a 'Test-And-Set'-function. According to the documentation MySQL does this.
The procedure is as follows.
3. If another session_id holds the lock, we check for that session's existence. If it still exists, we're out of luck: we can't obtain the lock.
4. If that other session no longer exists, we try to replace that other session's session_id with our own session_id, once again using a single UPDATE (avoiding another race condition). If that succeeds we're done and we have the lock; if it fails we're also done but without lock.
If locking the record fails because the record is already locked by another session, this routine returns information about that other session in $lockinfo. It is up to the caller to use this information or not.
Note. A record can stay locked if the webbrowser of the locking session has crashed. Eventually this will be resolved if the crashed session is removed from the sessions table. However, the user may have restarted her browser while the record was locked. From the new session it appears that the record is still locked. This may take a while. Mmmmm... The other option is to lock on a per-user basis rather than per-session basis. Mmmm... Should we ask the user to override the session if it happens to be the same user? Mmm. put it on the todo list. (A small improvement might be to call the garbage collection between step 2 and 3. Oh well).
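The 'Test-And-Set' use of UPDATE can be sketched with PDO. Everything in this sketch is an assumption made for illustration: the function name, the table and column names, the use of PDO itself, and in particular the opening UPDATE and the lookup of the current lock holder, which stand in for the steps not spelled out above. The $lockinfo output is omitted.

```php
<?php
// Hypothetical sketch: try to lock node $node_id for session $session_id.
// Returns TRUE if we hold the lock afterwards, FALSE otherwise.
function sketch_lock_node(PDO $db, $node_id, $session_id) {
    // Test-and-set: succeeds only if nobody holds the lock (assumed first step).
    $st = $db->prepare('UPDATE nodes SET locked_by_session_id = ? ' .
                       'WHERE node_id = ? AND locked_by_session_id IS NULL');
    $st->execute(array($session_id, $node_id));
    if ($st->rowCount() === 1) {
        return true;
    }
    // Find out who holds the lock (assumed second step).
    $st = $db->prepare('SELECT locked_by_session_id FROM nodes WHERE node_id = ?');
    $st->execute(array($node_id));
    $holder = $st->fetchColumn();
    if ($holder === false || $holder === null) {
        return false;        // no such record, or the lock changed under us
    }
    if ((int) $holder === (int) $session_id) {
        return true;         // we already had the lock
    }
    // Step 3: does the other session still exist?
    $st = $db->prepare('SELECT COUNT(*) FROM sessions WHERE session_id = ?');
    $st->execute(array($holder));
    if ((int) $st->fetchColumn() > 0) {
        return false;        // out of luck
    }
    // Step 4: take over the stale lock with a single UPDATE.
    $st = $db->prepare('UPDATE nodes SET locked_by_session_id = ? ' .
                       'WHERE node_id = ? AND locked_by_session_id = ?');
    $st->execute(array($session_id, $node_id, $holder));
    return $st->rowCount() === 1;
}
```

The WHERE clause in the last UPDATE names the previous holder explicitly, so if a third session took over the stale lock in the meantime, zero rows are affected and we end up without the lock.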
get record lock on a node
this is a wrapper around lock_record() for locking nodes.
unlock a record that was previously successfully locked
this removes the (co-operative) lock on the record with serial (pkey) $id in table $tablename by setting the $locked_by field to NULL. This is the companion routine of lock_record().
release lock on a node
this is a wrapper around lock_release() for unlocking nodes.
a simple function to log information to the database 'for future reference'
This adds a message to the table log_messages, including a time, the remote address and (of course) a message. See also the standard PHP-function syslog(). We use the existing symbolic constants for priority. Default value is LOG_INFO.
Note that messages with a priority LOG_DEBUG are only written to the log if the global parameter $CFG->debug is TRUE. All other messages are simply logged, no further questions asked.
If the caller does not provide a user_id, this routine attempts to read the user_id from the global $_SESSION array, i.e. we try to link events to a particular user if possible.
Note that with a field definition of varchar(255) there is room to store either an IPv4 address (max 15 bytes) or a full-blown IPv6 address (max 39 bytes).
this circumvents the 'magic' in magic_quotes_gpc() by conditionally stripping slashes
Magic quotes are a royal pain for portability. If magic quotes are enabled, this function reverses the effect. There are three PHP-parameters in php.ini affecting the magic:
This routine should be used to unquote strings from $_GET[], $_POST[] and $_COOKIE whenever they are needed.
Important note: because third party subsystems may deal with magic quotes on their own, it is a Bad Idea[tm] to globally replace the contents of $_GET[], $_POST[] and $_COOKIE with the unescaped values once at program start. Any subsystem would be confused if magic_quotes_gpc() indicates that the magic is in effect whereas in reality the magic was already undone at program start. Yes, this yields a performance penalty, but this magic was a mess right from the start. Hopefully PHP6 will get rid of this magic for once and for all...
return the number of database queries that were executed
return the script execution time
generate a string with quasi-random characters
This generates a string of $length quasi-random characters. The optional parameter $candidates determines which characters are eligible. Popular choices for $candidates are:
Note that this is an ASCII-centric routine: we only use plain ASCII letters and digits and none of the 64000 other Unicode characters in the Basic Multilingual Plane. The reason is simple: 7-bit ASCII characters have the best chance of getting through communication channels unmangled, so there.
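A minimal sketch of such a generator follows. The function name and the default value for $candidates are assumptions, and mt_rand() is used here simply because it is quasi-random; the actual routine may pick characters differently.

```php
<?php
// Hypothetical sketch: $length quasi-random characters taken from $candidates.
function sketch_quasi_random_string($length, $candidates = 'abcdefghijklmnopqrstuvwxyz0123456789') {
    $max = strlen($candidates) - 1;
    if ($max < 0) {
        return '';                           // nothing to choose from
    }
    $s = '';
    for ($i = 0; $i < $length; ++$i) {
        $s .= $candidates[mt_rand(0, $max)]; // pick one eligible character
    }
    return $s;
}
```

Note that mt_rand() is not suitable for secrets; on PHP 7 and later random_int() would be the better choice for that purpose.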
convert string $s from native format to quoted printable (RFC2045)
this converts the input string $s to quoted printable form as defined in RFC2045 (see http://www.ietf.org/rfc/rfc2045.txt). By default this routine assumes a line-oriented text input. This can be overruled by calling the routine with the parameter $textmode set to FALSE: in that case the input is considered to be a binary string with no embedded newlines.
The routine assumes that the input lines are delimited with $newline. By default this parameter is a LF (Linefeed) but it could be changed to another delimiter using the function parameter $newline.
According to RFC2045 the resulting output lines should be no longer than 76 bytes, even though it is very well possible to use shorter lines. This can be done by setting the parameter $max_length to the desired value. Note that this value is forced to be in the range 4,...,76.
The encoding is defined in section 6.7 of RFC2045 with these five rules.
(1) General 8bit representation: any character may be represented as "=" followed by two uppercase hexadecimal digits.
(2) Literal representation: characters "!" to "~" but excluding the "=" may represent themselves.
(3) White space: space " " and tab "\t" at the end of a line must use rule (1); in all other cases either rule (1) or (2) may be applied.
(4) Line breaks: the (hard) line breaks in the input must be represented using "\r\n" in the output.
(5) Soft line breaks: output lines may not be longer than 76 bytes. This can be enforced by inserting a soft line break (the string "=\r\n") in the output. This soft line break will disappear once the encoded string is decoded.
The basic conversion algorithm is constructed using two important variables:
Note that spaces (ASCII 32) and tabs (ASCII 9) are treated differently depending on their position in the line. The rule is that both should be represented as "=20" or "=09" at the end of an input line and that it is allowed to use " " or "\t" when NOT at the end of an input line. In the latter case, the output line will always end with a soft line break "=\r\n" which makes sure that there are no trailing spaces/tabs in the output line anyway.
Also note that the end of the input $s is also flagged via setting $next_is_newline. This is an optimisation which treats spaces and tabs at the end of the input as if they were at the end of an input line, i.e. converting to "=20" or "=09". This means that the output will never end with a space or a tab, even if the input does.
Note that in case of a binary conversion the input character(s) that might otherwise indicate a newline are to be considered as binary data. However, if the data is completely binary, it probably doesn't make sense to use Quoted-Printable in the first place (base64 would probably be a better choice).
Reference: see http://www.ietf.org/rfc/rfc2045.txt.
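The five rules can be illustrated with a compact sketch. This is not the routine from waslib.php: the function name is hypothetical, the input is simply split on $newline instead of being scanned with a look-ahead flag, and $newline is assumed to be a non-empty string. For simplicity every output line keeps one byte in reserve for a possible soft line break, so lines may be one byte shorter than strictly necessary.

```php
<?php
// Hypothetical sketch of a quoted-printable encoder (RFC2045 section 6.7).
function sketch_quoted_printable($s, $textmode = true, $newline = "\n", $max_length = 76) {
    $max_length = max(4, min(76, (int) $max_length));   // force range 4,...,76
    // In binary mode the whole input is one 'line'; newlines become =XX.
    $lines = $textmode ? explode($newline, $s) : array($s);
    $output = array();
    foreach ($lines as $line) {
        $current = '';
        $last = strlen($line) - 1;
        for ($i = 0; $i <= $last; ++$i) {
            $c = $line[$i];
            $code = ord($c);
            $is_blank = ($c === ' ' || $c === "\t");
            if (($code >= 33 && $code <= 126 && $c !== '=')   // rule (2)
                || ($is_blank && $i < $last)) {               // rule (3)
                $token = $c;
            } else {
                $token = sprintf('=%02X', $code);             // rule (1)
            }
            if (strlen($current) + strlen($token) > $max_length - 1) {
                $output[] = $current . '=';                   // rule (5)
                $current = '';
            }
            $current .= $token;
        }
        $output[] = $current;
    }
    return implode("\r\n", $output);                          // rule (4)
}
```

The result can be checked against PHP's built-in quoted_printable_decode(); PHP 5.3 and later also provide quoted_printable_encode(), albeit without the $textmode, $newline and $max_length parameters.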
redirect to another url by sending an http header
unfold a possible multiline string
This removes all linefeeds and carriage returns from a string. Typical use would be to strip newlines from the subject line of a mail message, where they might interfere with proper sending of mail headers.
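In PHP this comes down to a single str_replace(). The sketch below uses a hypothetical function name and shows why it matters for mail headers.

```php
<?php
// Hypothetical sketch: remove all CR and LF characters from a string.
function sketch_unfold($s) {
    return str_replace(array("\r", "\n"), '', $s);
}
```

A subject such as "Hello\r\nBcc: someone@example.com" is reduced to a single harmless line, so it can no longer smuggle an extra header into the message.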
sanitise a string to make it acceptable as a filename/directoryname
this routine analyses and maybe converts the input string as follows:
Also note that a full pathname will yield something that looks like a simple filename without directories or drive letter: C:\Program Files\Apache Group\htpasswd becomes C_Program_Files_Apache_Group_htpasswd and /etc/passwd becomes etc_passwd. Also this routine makes a URL look like a filename: http://www.example.com becomes http_www.example.com.
Finally note that we don't even attempt to transliterate utf8-characters or any other characters between 128 and 255; these are simply removed.
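The examples above can be reproduced with a very small sketch. The rules used here are assumptions derived from those examples only (the exact rules of the actual routine are not reproduced): bytes between 128 and 255 are removed, every run of other characters outside letters, digits, dot, underscore and dash becomes a single underscore, and leading and trailing underscores are dropped. The function name is hypothetical.

```php
<?php
// Hypothetical sketch: make a string acceptable as a file or directory name.
function sketch_sanitise_filename($name) {
    $name = preg_replace('/[\x80-\xFF]/', '', $name);        // drop bytes 128..255
    $name = preg_replace('/[^A-Za-z0-9._-]+/', '_', $name);  // runs of others => '_'
    return trim($name, '_');                                 // no '_' at either end
}
```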
convert a string representation of a date/time to a timestamp
this is a crude date/time parser. We collect digits and convert to integers. With the integers we fill an array with at least 6 integers, corresponding to year, month, day, hours, minutes and seconds. If there are fewer than six numbers in the source string the value 0 is used for the remaining elements. Note that a number in this context is always a non-negative number because a dash (or minus) is considered a delimiter.
Note that valid date/time values are limited to how many seconds can be represented in a signed long integer, where 0 equates to 1970-01-01 00:00:00 (the Unix epoch). The upper limit for a 32-bit int is some date in 2038 (only 30 years from now).
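The digit-collecting approach can be sketched in a few lines. The function names are hypothetical, the sketch keeps exactly six numbers and ignores any extra ones, and it performs no validation of the resulting date; whether the actual routine validates is not covered here.

```php
<?php
// Hypothetical sketch: extract year, month, day, hours, minutes, seconds.
function sketch_date_parts($s) {
    preg_match_all('/[0-9]+/', $s, $matches);    // anything else is a delimiter
    $parts = array_map('intval', $matches[0]);
    // Pad with zeros up to six elements, then ignore anything beyond six.
    return array_slice(array_pad($parts, 6, 0), 0, 6);
}

// Hypothetical sketch: convert the six numbers to a timestamp (local time).
function sketch_string2time($s) {
    list($year, $month, $day, $hours, $minutes, $seconds) = sketch_date_parts($s);
    return mktime($hours, $minutes, $seconds, $month, $day, $year);
}
```

Because every non-digit is a delimiter, '2011-05-11 23:45:46' and '2011/05/11 23.45.46' yield the same six numbers.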
translation of phrases via a function with a very short name
This is only a wrapper function for $LANGUAGE->get_phrase()
Documentation generated on Wed, 11 May 2011 23:45:46 +0200 by phpDocumentor 1.4.0