File /program/lib/waslib.php

Description

/program/lib/waslib.php - core functions

This file provides various utility routines.

Constants
CAPACITY_CHAIR = 8 (line 38)
CAPACITY_CUSTOM1 = 11 (line 41)
CAPACITY_CUSTOM2 = 12 (line 42)
CAPACITY_CUSTOM3 = 13 (line 43)
CAPACITY_CUSTOM4 = 14 (line 44)
CAPACITY_CUSTOM5 = 15 (line 45)
CAPACITY_CUSTOM6 = 16 (line 46)
CAPACITY_CUSTOM7 = 17 (line 47)
CAPACITY_CUSTOM8 = 18 (line 48)
CAPACITY_CUSTOM9 = 19 (line 49)
CAPACITY_EDITOR = 9 (line 39)
CAPACITY_MEMBER = 4 (line 34)
CAPACITY_NEXT_AVAILABLE = 20 (line 50)
CAPACITY_NONE = 0 (line 30)

The constants CAPACITY_* are used for group memberships (see accountmanagerlib.php).

CAPACITY_PRINCIPAL = 3 (line 33)
CAPACITY_PROJECTLEAD = 5 (line 35)
CAPACITY_PUBLISHER = 10 (line 40)
CAPACITY_PUPIL = 1 (line 31)
CAPACITY_SECRETARY = 7 (line 37)
CAPACITY_TEACHER = 2 (line 32)
CAPACITY_TREASURER = 6 (line 36)
QUASI_RANDOM_DIGITS = 10 (line 253)
QUASI_RANDOM_DIGITS_UPPER = 36 (line 255)
QUASI_RANDOM_DIGITS_UPPER_LOWER = 62 (line 256)
QUASI_RANDOM_HEXDIGITS = 16 (line 254)
Functions
appropriate_legal_notices (line 1794)

construct a link to appropriate legal notices as per AGPLv3 section 5

This routine constructs ready-to-use HTML-code for a link to the Appropriate Legal Notices, which are to be found in /program/about.html. Depending on the $text_only flag we generate either a text-based link or a clickable image.

The actual text / image to use depends on the global constant WAS_ORIGINAL. This constant is defined in /program/version.php and it should be TRUE for the original version of Website@School and FALSE for modified versions.

In the former case the anchor looks like 'Powered by Website@School', in the latter case it will look like 'Based on Website@School', which is in line with the requirements from the license agreement for Website@School, see /program/license.html.

IMPORTANT NOTE

Please respect the license agreement and change the definition of WAS_ORIGINAL to FALSE if you modify this program (see /program/version.php). You also should change the file '/program/about.html' and add a 'prominent notice' of your modifications.

Note: a comparable routine can be found in install.php.

  • return: ready-to-use HTML
string appropriate_legal_notices ([bool $text_only = FALSE], [string $m = ''])
  • bool $text_only: if TRUE we return a text-only link, otherwise a clickable image
  • string $m: margin to improve readability of generated code
calculate_uri_shortcuts (line 189)

try to eliminate the scheme and authority from the two main uri's

This tries to get rid of the scheme and the authority in 'www' and 'progwww'. If these two elements are the same, it becomes possible to use a shorter form of the uri when referencing files in 'progwww' from 'www'.

If the scheme and the authority of 'www' and 'progwww' are the same, the returned strings contain only the path elements. If scheme and authority differ, they contain the same as 'www' and 'progwww' respectively.

Examples: www = 'http://www.example.com/site' and progwww = 'http://www.example.com/site/program' yields www_short = '' and progwww_short = '/program'.

www = 'http://www.example.com' and progwww = 'http://common.example.com/program' yields www_short identical to www and progwww_short identical to progwww.

The purpose is to be able to generate relative links, e.g. an image in /program/graphics/foo.jpg can be referred to like this

    <img src="{$CFG->progwww_short}/graphics/foo.jpg"> or <img src="/program/graphics/foo.jpg"> rather than <img src="http://www.example.com/program/graphics/foo.jpg">

Note that the comparison in this routine is not very fancy, it can be easily fooled to consider scheme+authority to be different. However, since this routine is only used to compare two values from config.php, it's not likely to cause trouble.

  • return: the two short versions of www and progwww, if possible
array calculate_uri_shortcuts (string $www, string $progwww)
  • string $www: the uri (scheme / authority / path) of the directory holding config.php
  • string $progwww: the uri (scheme / authority / path) corresponding with the program directory
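The comparison can be sketched roughly as follows. This is a minimal illustration, not the routine itself: the name sketch_uri_shortcuts() is hypothetical, and the sketch applies the 'path elements only' rule literally via parse_url(), so for a www that itself carries a path component its output may differ from what the real routine returns.

```php
<?php
/**
 * Hypothetical sketch: strip a shared scheme+authority from two URIs.
 * Returns array($www_short, $progwww_short).
 */
function sketch_uri_shortcuts(string $www, string $progwww): array
{
    $a = parse_url($www);
    $b = parse_url($progwww);
    if ($a === false || $b === false) {
        return array($www, $progwww); // unparsable: keep the full URIs
    }
    // Plain comparison of scheme, host and port; just as easy to fool as the text admits.
    foreach (array('scheme', 'host', 'port') as $part) {
        if (($a[$part] ?? null) !== ($b[$part] ?? null)) {
            return array($www, $progwww); // different scheme/authority: no shortcut
        }
    }
    // Same scheme and authority: only the path elements remain.
    return array($a['path'] ?? '', $b['path'] ?? '');
}
```

With www = 'http://www.example.com' and progwww = 'http://www.example.com/program' this yields '' and '/program', which is the form used in the img example above.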
calc_user_related_acls (line 1429)

calculate an array with acls related to user $user_id via group memberships

this calculates the related acls for user $user_id. The results are returned as an array keyed by acl_id. It can contain 0 or more elements. The values of the array elements are groupname/capacity-pairs. This routine is referenced from both useraccount.class.php and usermanager.class.php.

  • return: 0, 1 or more acl_id => groupname/capacity pairs
array calc_user_related_acls (int $user_id)
  • int $user_id: the user we're looking at
capacity_name (line 1406)

translate a numeric capacity code to a readable name

this translates a capacity code into a readable name, e.g. as an item in a dropdown list when dealing with group memberships. The actual codes are defined as constants, e.g. CAPACITY_NONE.

  • return: readable name of capacity
string capacity_name (int $capacity)
  • int $capacity: numeric code of capacity
convert_to_type (line 1489)

convert a string to another type (bool, int, etc.)

  • return: the value $value cast to the proper type
  • todo: perhaps change the possible values of $type to full strings rather than 'cryptic' single letter codes. Furthermore: what do we do with invalid dates, times and date/times? For now it is a stub, returning $value as-is. Oh well.
mixed convert_to_type (string $type, string $value)
  • string $type: new type for $value: b=bool, i=integer, s=string, etc.
  • string $value: the value to convert to type $type
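A conversion along these lines might look like the sketch below. It is an illustration under assumptions: the function name is hypothetical, the type codes are the ones listed under get_properties(), and the way booleans are stored ('0'/'1') is a guess. Dates and times are passed through as strings, in line with the todo note.

```php
<?php
/**
 * Hypothetical sketch of a conversion driven by single-letter type codes.
 */
function sketch_convert_to_type(string $type, string $value)
{
    switch ($type) {
        case 'b':
            return (bool) intval($value); // assumption: booleans are stored as '0' or '1'
        case 'i':
            return intval($value);
        case 'f':
            return floatval($value);
        case 'd':   // date, handled like a string
        case 'dt':  // datetime, handled like a string
        case 't':   // time, handled like a string
        case 's':
        default:
            return $value;                // returned as-is, no validation
    }
}
```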
cron_send_queued_alerts (line 996)

send pending messages/alerts

this goes through all the alert accounts to see if any messages need to be sent out by email. The strategy is as follows. First we collect a maximum of $max_messages alerts in core (1 trip to the database). Then we iterate through that collection and for every alert we

  1. construct and send an email message
  2. update the record (reset the message buffer and message count) (+1 trip to the database)
Locking and unlocking would be even more expensive, especially when chances of race conditions are not so big. (An earlier version of this routine went to the database once for the list of all pending alerts and subsequently twice for each alert but eventually I considered that too expensive too).

Assuming that an UPDATE is more or less atomic, we hopefully can get away with an UPDATE with a where clause looking explicitly for the previous value of the message count. If a message was added after retrieving the alerts but before updating, the message count would be incremented (by the other process) which would prevent us from updating. The alert would be left unchanged but including the added message. Worst case: the receiver gets the same list of alerts again and again. I consider that a fair trade off, given the low probability of it happening. (Mmmm, famous last words...)

Bottom line, we don't do locking in this routine.

Note that we add a small reminder to the message buffer about us processing the alert and sending a message. However, we don't set the number of messages to 1 because otherwise that would be the signal to send this message the next time. We don't want to send a message every $cron_interval minutes basically saying that we didn't do anything since the previous run. (Or is this a feature after all?)

Failures are logged, successes are logged as WLOG_DEBUG.

  • return: the number of messages that were processed
int cron_send_queued_alerts ([int $max_messages = 10])
  • int $max_messages: do not send more than this number of messages
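The 'UPDATE with a where clause on the previous message count' idea can be illustrated without a database. The sketch below is an in-memory stand-in; the function name, the array layout and the field names 'messages' and 'message_buffer' are assumptions, and the reminder that the real routine leaves in the buffer is omitted.

```php
<?php
/**
 * Hypothetical in-memory stand-in for a conditional update such as
 *   UPDATE alerts SET messages = 0, message_buffer = ''
 *   WHERE alert_id = $alert_id AND messages = $seen_count
 * Returns the number of 'affected rows' (0 or 1).
 */
function sketch_reset_alert_if_unchanged(array &$alerts, int $alert_id, int $seen_count): int
{
    if (!isset($alerts[$alert_id]) || $alerts[$alert_id]['messages'] !== $seen_count) {
        // Another process added a message after we read the alert: leave it untouched.
        return 0;
    }
    $alerts[$alert_id]['messages'] = 0;
    $alerts[$alert_id]['message_buffer'] = '';
    return 1;
}
```

If 0 is returned, the alert keeps its buffer including the newly added message, so the worst case is that the receiver sees the same messages again on the next run.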
friendly_bookmark (line 2034)

construct an alphanumeric string from a (node) title yielding a readable bookmark filename

this strips everything from $title except alphanumerics. Runs of other characters are translated to a single underscore. Length of result is limited to a length of $maxlen bytes (default 50). This includes the length of the extension $ext.

Note that the $title is UTF-8 and may contain non-ASCII characters. This routine deals with that situation by first converting the UTF-8 string to ASCII as much as possible (e.g. convert 'e-aigu' to plain 'e') and subsequently converting all remaining non-letters/digits to underscores.

Finally the result is stripped from leading/trailing underscores. If this yields a non-empty string, the extension $ext (default '.html') is appended.

Note: this routine works best with latin-like text; if $title is completely written in Chinese (or other UTF-8 characters without a corresponding ASCII replacement) we end up with a single underscore which is subsequently trim()'ed, yielding an empty string and no $ext added. I am not sure what to do about that.

Note: the extension is not checked for non-alphanumerics because it is the responsibility of the caller to provide a decent $ext if the default '.html' is not used.

  • return: string with only alphanumerics and underscores, max $maxlen chars
  • usedby: was_node_url()
string friendly_bookmark (string $title, [int $maxlen = 50], [string $ext = '.html'])
  • string $title: input text
  • int $maxlen: the maximum length of the result
  • string $ext: the filename extension added to a non-empty result
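The steps described above can be sketched as follows. The function name is hypothetical, and the tiny hand-made transliteration table is an assumption that stands in for whatever UTF-8-to-ASCII conversion the real routine uses.

```php
<?php
/**
 * Hypothetical sketch of the bookmark construction.
 */
function sketch_friendly_bookmark(string $title, int $maxlen = 50, string $ext = '.html'): string
{
    // Assumption: a very small transliteration table, for illustration only.
    $map = array('é' => 'e', 'è' => 'e', 'ë' => 'e', 'ü' => 'u', 'ö' => 'o', 'ç' => 'c');
    $ascii = strtr($title, $map);
    // Every run of non-alphanumerics (including leftover non-ASCII bytes) becomes one underscore.
    $s = preg_replace('/[^A-Za-z0-9]+/', '_', $ascii);
    // Limit the length, reserving room for the extension.
    $s = substr($s, 0, max(0, $maxlen - strlen($ext)));
    // Finally strip leading/trailing underscores.
    $s = trim($s, '_');
    // The extension is only appended to a non-empty result.
    return ($s === '') ? '' : $s . $ext;
}
```

A title consisting only of characters without an entry in the table collapses to a single underscore, which is trimmed away, so the result is the empty string described in the note above.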
get_area_records (line 1361)

retrieve a list of all available area records keyed by area_id

this returns a list of area-records or FALSE if no areas are available. The list is cached via a static variable so we don't have to go to the database more than once for this. Note that the returned array is keyed with area_id and is sorted by sort_order. Also note that this list may include areas for which the current user has no permissions whatsoever and also areas that are inactive.

  • return: FALSE if no areas available or an array with area-records
array|bool get_area_records ([bool $forced = FALSE])
  • bool $forced: if TRUE forces reread from database (resets the cache)
get_cookie_string (line 558)

return an (unquoted) string value specified in the cookie header or default value if none

This validates and magic_unquotes() the specified cookie and returns either the valid UTF-8 value or the UTF-8 substitution. If the cookie is not set in _COOKIE, the default value is returned. It is the responsibility of the caller to provide a workable default value.

  • return: the value of the parameter or the default value if not specified
mixed get_cookie_string (string $name, [mixed $default_value = NULL])
  • string $name: the name of the cookie
  • mixed $default_value: the value to return if cookie was not found
get_csrftoken (line 2278)

get csrf token name and value

this retrieves the current csrf token name and token value from $_SESSION. If one of those is not already set we simply dream up a new one. This routine ALWAYS returns an array with two elements.

  • return: contains two elements with token name and value
array get_csrftoken ()
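The 'dream up a new one if missing' behaviour might look like the sketch below. Everything specific here is an assumption: the function name, the keys 'csrf_name' and 'csrf_value', the token format, and the use of a plain array passed by reference instead of $_SESSION (so the sketch runs without an active session).

```php
<?php
/**
 * Hypothetical sketch: fetch the token pair from a session-like array,
 * creating whichever element is missing.
 */
function sketch_get_csrftoken(array &$session): array
{
    if (!isset($session['csrf_name'])) {
        $session['csrf_name'] = 'token_' . bin2hex(random_bytes(4));
    }
    if (!isset($session['csrf_value'])) {
        $session['csrf_value'] = bin2hex(random_bytes(16));
    }
    // Always two elements: token name and token value.
    return array($session['csrf_name'], $session['csrf_value']);
}
```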
get_editor_names (line 2395)

prepare a list of available editors

this routine returns a hardcoded list of available editors: we do not expect to be adding or removing editors to/from the CMS on a regular basis, even though CKEditor 3 was added in March 2012 and CKEditor 4 was added in November 2014.

It might be cleaner to base this list on the site configuration options in the config table: a picklist of available editors is available in the 'editor' parameter in the table 'config'. The actual implementation of editors is done in dialog_get_widget_richtextinput() in dialoglib.php.

Here we (re-)use the translations for the (short) editor option and (long) editor name from the site config dialogs, e.g. via a constructed key 'site_config_editor_{$editor}_option'.

Note: This routine is used by both the User Manager and the MyPage module.

  • return: list of available editors
  • todo: retrieve this list from 'config'-table?
array get_editor_names ()
get_friendly_parameter (line 2098)

retrieve a named parameter from the friendly URL

This routine attempts to parse the PATH_INFO server variable and extract the parameters and values stored in the path components. (see also was_node_url()).

Example: the URL

    /was/index.php/35/photo/5/Picture_of_our_field_trip.html
is broken down as follows:
  • /35 is the first non-empty parameter and also is completely numeric and hence interpreted as a node_id;
  • /photo/5 is considered a key-value-pair with key=photo and value=5;
  • /Picture_of_our_field_trip.html is the last component and is discarded
The static array which caches the results of the parsing will contain this: $parameters = array('node' => 35, 'photo' => 5);

Note that all parameters are checked for valid UTF-8. If either key or value is NOT UTF-8, the pair is silently discarded. This prevents tricks with overlong sequences and other UTF-8 black magic.

Once the parsed friendly path is cached the parameter $name is looked up. If found, the corresponding value is returned. If it is not found, $default_value is returned.

The cache is rebuilt if $force is TRUE (should never be necessary)

Note: the parameter 'node' is a special case: if it is specified it is the first parameter. This parameter otherwise is unnamed.

  • return: either the $default_value or the value of the named parameter
mixed get_friendly_parameter (string $name, [mixed $default_value = NULL], [bool $force = FALSE])
  • string $name: the parameter we need to look for
  • mixed $default_value: is returned if the parameter was not found
  • bool $force: if TRUE forces the parsing to be redone
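The parsing described above can be sketched as follows. This is an illustration, not the real routine: the function name is hypothetical, caching is left out, values stay strings (except the node id), and the rule that an unpaired trailing component is the dummy filename is an assumption made so that both '/35/photo/5/Picture_of_our_field_trip.html' and '/area/1' work.

```php
<?php
/**
 * Hypothetical sketch: turn a PATH_INFO string into an array of parameters.
 */
function sketch_parse_friendly_path(string $path_info): array
{
    // Split on '/' and drop empty components.
    $parts = array_values(array_filter(explode('/', $path_info), function ($p) {
        return $p !== '';
    }));
    $parameters = array();
    // A completely numeric first component is the unnamed node id.
    if (isset($parts[0]) && preg_match('/^[0-9]+$/', $parts[0]) === 1) {
        $parameters['node'] = intval(array_shift($parts));
    }
    // The rest is read as key/value pairs; an unpaired trailing component is ignored.
    for ($i = 0; $i + 1 < count($parts); $i += 2) {
        $key = $parts[$i];
        $value = $parts[$i + 1];
        // preg_match() with the /u modifier fails on invalid UTF-8: silently discard the pair.
        if (preg_match('//u', $key) !== 1 || preg_match('//u', $value) !== 1) {
            continue;
        }
        $parameters[$key] = $value;
    }
    return $parameters;
}
```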
get_module_records (line 1383)

retrieve a list of all available module records

this returns a list of active module-records or FALSE if none are available. The list is cached via a static variable so we don't have to go to the database more than once for this. Note that the returned array is keyed with module_id.

  • return: FALSE if no modules available or an array with module-records
array|bool get_module_records ([bool $forced = FALSE])
  • bool $forced: if TRUE forces reread from database (resets the cache)
get_page_address_url (line 2354)

return the reconstructed URL in a single (indented) line

This constructs the URL (including the GET-parameters and PATH_INFO) of the current script. The current script is identified using the basename of the entry point which is available in the global constant WASENTRY.

This URL is returned as HTML so it can be displayed. It is NOT meant to be a clickable link, but as a documentation of the actual URL that was used. Note that this URL can be suppressed by an appropriate 'display:none' in the stylesheet, making it an item that only appears on a hardcopy (media="print") and not on screen.

If somehow the input is invalid UTF-8, we replace the offending strings with the Unicode substitution character U+FFFD in UTF-8 encoded form (i.e. the three-byte string 0xEF 0xBF 0xBD).

Note that we do need magic_unquote() because we are dealing with the PHP-version of parameters in $_GET[] and $_SERVER[] which - unfortunately - have magic quotes. We cannot use the routines because we would miss the last part of the PATH_INFO (the dummy 'filename'). Also, we need to validate the code for UTF-8 validity; we still do not want a malicious user somehow abusing /%C0%AE%2E/ (overlong UTF-8 equivalent of /../) to traverse the directory tree.

Note that two variants of this routine used to live in class Theme and class AdminOutput. Wrappers remain, though.

  • return: reconstructed URL as text
string get_page_address_url ([string $m = ''])
  • string $m: left margin for increased readability
get_parameter_int (line 514)

return an integer value specified in the page request or default value if none

this routine first checks the friendly url to see if the requested parameter is specified there. If it is, we will use it unless there is also a parameter in $_GET that prevails. If the parameter is not specified at all, the $default_value is returned. It is the responsibility of the caller to provide a workable default value.

Note that invalid UTF-8 is silently discarded.

  • return: the value of the parameter or the default value if not specified
mixed get_parameter_int (string $name, [mixed $default_value = NULL])
  • string $name: the name of the parameter to retrieve the value of
  • mixed $default_value: the value to return if parameter was not specified
get_parameter_string (line 538)

return an (unquoted) string value specified in the page request or default value if none

First check out the friendly url for the named parameter. If it exists, we use that, otherwise we have the $default_value. After that the valid UTF-8 value may overwrite the value found in the friendly url (or the default value).

It is the responsibility of the caller to provide a workable default value.

Note that invalid UTF-8 is silently discarded.

  • return: the value of the parameter or the default value if not specified
mixed get_parameter_string (string $name, [mixed $default_value = NULL])
  • string $name: the name of the parameter to retrieve the value of
  • mixed $default_value: the value to return if parameter was not specified
get_properties (line 116)

retrieve typed properties (name-value-pairs) from a table

this retrieves the fields 'name', 'value' and 'type' from all records from $tablename that satisfy the condition in $where. The values, which are stored as strings in the database, are converted to their proper value type and stored in the resulting array, keyed by name. The following types are recognised:

  • b = boolean
  • d = date ('yyyy-mm-dd', handled like a string)
  • dt = datetime ('yyyy-mm-dd hh:mm:ss', handled like a string)
  • f = float
  • i = integer
  • s = string
  • t = time ('hh:mm:ss', handled like a string)
Note that we currently do not validate these properties, the assumption is that the values are valid (or empty).

  • return: FALSE on error, or an array with name-value-pairs
bool|array get_properties ([string $tablename = 'config'], [array|string $where = ''])
  • string $tablename: the name of the table holding the properties
  • array|string $where: which records do we need to select
get_requested_area (line 467)

get the number of the area the user requested or null if not specified

See discussion of get_requested_node().

  • return: integer indicating the area or null if none specified
int|null get_requested_area ()
get_requested_filename (line 489)

get the name of the requested file

See discussion of get_requested_node(). Files are served via /file.php via a comparable mechanism: either

http://exemplum.eu/was/file.php/path/to/filename.ext

OR

http://exemplum.eu/was/file.php?file=/path/to/filename.ext

This routine extracts the '/path/to/filename.ext' part.

Note that we require valid UTF-8. If the path is not UTF-8, we return NULL.

  • return: requested filename or null if none specified or invalid UTF-8
string|null get_requested_filename ()
get_requested_node (line 456)

get the number of the node the user requested or NULL if not specified

This routine exists because nodes and (to a lesser extent) areas are so central to the whole idea of WAS.

A specific node can be requested in two different ways, for example page 35 with an additional parameter 'photo' with value 5 is called either via

    http://exemplum.eu/was/index.php/35/photo/5/Picture_of_our_field_trip.html
or
    http://exemplum.eu/was/index.php?node=35&photo=5

The routine get_parameter_int() with a default value of NULL yields 35 in both cases.

A node can also be specified implicitly, e.g. via

    http://exemplum.eu/was/index.php/area/1
or
    http://exemplum.eu/was/index.php?area=1
which yields the default node for area 1, or simply
    http://exemplum.eu/was/index.php
which yields the default node in the default area.

Important note: In previous versions of this routine (and get_requested_area()) we also accepted constructs like

    http://exemplum.eu/was/index.php/35
    http://exemplum.eu/was/index.php/1/35
but this has the great disadvantage that an ever growing list of integers (or more generally: positional parameters) is not very handy in the long run. For instance: how to convey that we want to see photo #5 on page #35? 'index.php/35/5'? How is that to be interpreted differently from page 5 in area 35?

I now think the better approach is to use key/value-pairs in the friendly url path, and also get rid of the unnamed int indicating 'area'. The latter wasn't really useful anyway, because specifying a node IMPLIES an area and it could even cause trouble if a bookmarked area+node would be moved to another area: the bookmark would yield an error message rather than the node (in another area) or the default node in the bookmarked area.

All in all this change makes this routine extremely simple: it is almost another name for get_parameter_int(). It is still possible to specify both node AND area (although there is no need):
    http://exemplum.eu/was/index.php/35/area/1
    http://exemplum.eu/was/index.php/node/35/area/1
    http://exemplum.eu/was/index.php/area/1/node/35

Note that this routine does not validate the requested node in any way other than making sure that IF it is specified, it is valid UTF-8 and it is an integer value. For all we know it might even be a negative value.

  • return: integer indicating the node or NULL if none specified
int|null get_requested_node ()
get_skin_names (line 2416)

prepare a list of available skins

this routine returns a hardcoded list of available skins: we do not expect to be adding or removing skins to/from the CMS any time soon.

Note: This routine is used by both the User Manager and the MyPage module.

  • return: list of available skins
array get_skin_names ()
get_unique_number (line 1755)

a small utility routine that returns a unique integer

this generates a unique number (starting at 1). This number is guaranteed to be unique during this http-request (or at least until the static variable $id overflows, but that takes a while). If the optional parameter $increment is FALSE, the latest id returned is returned again.

  • return: a new unique value every time
int get_unique_number ([bool $increment = TRUE])
  • bool $increment: optional indicates whether the static counter must be incremented
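A counter of this kind is typically built on a static variable. The sketch below shows the idea; the function name is hypothetical and the real routine may differ in detail.

```php
<?php
/**
 * Hypothetical sketch: a per-request unique number based on a static counter.
 */
function sketch_unique_number(bool $increment = true): int
{
    static $id = 0;   // lives for the duration of the request
    if ($increment) {
        ++$id;        // the first call yields 1
    }
    return $id;       // with $increment FALSE the latest id is returned again
}
```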
get_user_groups (line 1458)

retrieve the records of the groups of which user $user_id is a member

  • return: 0, 1 or more acl_id => $group_record pairs
  • uses: $DB
array get_user_groups (int $user_id)
  • int $user_id: the user we're looking at
hmac (line 2251)

calculate hmac according to RFC2104 (February 1997)

Note: strings $opad and $ipad are created by simply copying $key. The contents are not important because we overwrite the contents in the loop anyway.

  • return: hashed message authentication code of $message
string hmac (string $key, string $message, [bool $raw = FALSE], [function $hash = "sha1"])
  • string $key: (shared) secret key
  • string $message
  • bool $raw: TRUE return binary hmac, FALSE hexadecimal
  • function $hash: either sha1 (default) or md5
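For reference, the RFC 2104 construction can be sketched as below. This is not the routine documented here: the function name is hypothetical, the hash is selected by name via PHP's hash() rather than passed as a function, and the pads are built with a string XOR instead of the loop mentioned in the note. The block size of 64 bytes holds for both sha1 and md5.

```php
<?php
/**
 * Hypothetical sketch of HMAC per RFC 2104 for hashes with a 64-byte block.
 */
function sketch_hmac(string $key, string $message, bool $raw = false, string $hash = 'sha1'): string
{
    $blocksize = 64;
    if (strlen($key) > $blocksize) {
        $key = hash($hash, $key, true);         // keys longer than a block are hashed first
    }
    $key = str_pad($key, $blocksize, "\0");     // then padded with zero bytes to one block
    $ipad = $key ^ str_repeat("\x36", $blocksize);
    $opad = $key ^ str_repeat("\x5c", $blocksize);
    $inner = hash($hash, $ipad . $message, true);
    $digest = hash($hash, $opad . $inner, true);
    return $raw ? $digest : bin2hex($digest);   // binary or hexadecimal result
}
```

The result can be checked against PHP's built-in hash_hmac(), which takes its arguments in the order algorithm, message, key.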
ini_get_int (line 1590)

return an integer (bytecount) value from PHP ini

  • return: value expressed in bytes
int ini_get_int (string $variable)
  • string $variable: name of the variable to retrieve, e.g. 'upload_max_filesize'
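Values such as upload_max_filesize are usually written in php.ini shorthand, e.g. '8M'. Converting such a string to a byte count could look like the sketch below; the function name is hypothetical and the sketch works on a given string rather than calling ini_get(), so it is only an approximation of what the real routine does.

```php
<?php
/**
 * Hypothetical sketch: convert a php.ini shorthand value such as '8M' to bytes.
 */
function sketch_shorthand_to_bytes(string $value): int
{
    $value = trim($value);
    $number = intval($value);                  // leading digits, e.g. 8 from '8M'
    switch (strtoupper(substr($value, -1))) {  // optional suffix K, M or G
        case 'G':
            $number *= 1024;                   // deliberate fall-through
        case 'M':
            $number *= 1024;                   // deliberate fall-through
        case 'K':
            $number *= 1024;
    }
    return $number;
}
```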
is_expired (line 1323)

determine if any of the ancestors or $node_id itself is already expired

This climbs the tree upward, starting at $node_id, to see if any nodes are expired. If an expired node is detected, TRUE is returned. If none of the nodes are expired, then FALSE is returned.

Note that this routine looks strictly at the expiry property, it is very well possible that a node is under embargo, see is_under_embargo().

Also note that this routine currently also tries to 'fix' the node database when a circular reference is detected. This doesn't really belong here, but for the time being it is convenient to have this auto-repair mechanism here. The node that is fixed is the section we are looking at after MAXIMUM_ITERATIONS tries, which is not necessarily the node we started with.

  • return: TRUE if any ancestor (or node_id) is expired, otherwise FALSE
  • todo: this function also 'repairs' circular references. This should move to a separate tree-repair function but for the time being it is "convenient" to have automatic repairs...
  • uses: $DB
bool is_expired (int $node_id, array &$tree)
  • int $node_id
  • array &$tree: family tree
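Climbing the tree can be sketched as follows. The shape of $tree is an assumption (each entry holding 'parent_id' and 'expiry', with parent_id 0 marking a top-level node), as are the function name, the explicit $now parameter and the iteration limit of 50. The sketch only guards against circular references; it does not repair them.

```php
<?php
/**
 * Hypothetical sketch: is $node_id or any of its ancestors expired at time $now?
 * Assumed tree shape: $tree[$id] = array('parent_id' => int, 'expiry' => 'yyyy-mm-dd hh:mm:ss').
 */
function sketch_is_expired(int $node_id, array $tree, string $now, int $max_iterations = 50): bool
{
    for ($i = 0; $i < $max_iterations && isset($tree[$node_id]); ++$i) {
        // Datetimes in 'yyyy-mm-dd hh:mm:ss' form sort correctly as plain strings.
        if (strcmp($tree[$node_id]['expiry'], $now) <= 0) {
            return true;
        }
        $node_id = $tree[$node_id]['parent_id']; // one level up
    }
    // Reached the top, or gave up on a circular reference, without finding an expired node.
    return false;
}
```

The same loop with an embargo field and the comparison reversed would give the companion check described under is_under_embargo().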
is_under_embargo (line 1275)

determine if any of the ancestors or $node_id itself is under embargo

This climbs the tree upward, starting at $node_id, to see if any nodes are under embargo. If an embargo'ed node is detected, TRUE is returned. If none of the nodes are under embargo, then FALSE is returned.

Note that this routine looks strictly at the embargo property, it is very well possible that a node is expired, see is_expired().

Also note that this routine currently also tries to 'fix' the node database when a circular reference is detected. This doesn't really belong here, but for the time being it is convenient to have this auto-repair mechanism here. The node that is fixed is the section we are looking at after MAXIMUM_ITERATIONS tries, which is not necessarily the node we started with.

  • return: TRUE if any ancestor (or node_id) is under embargo, otherwise FALSE
  • todo: this function also 'repairs' circular references. This should move to a separate tree-repair function but for the time being it is "convenient" to have automatic repairs...
  • uses: $DB
bool is_under_embargo (array &$tree, int $node_id)
  • array &$tree: family tree
  • int $node_id
javascript_alert (line 298)

massage a message and generate a javascript alert()

  • return: javascript code with alert() function call with properly escaped message string
string javascript_alert (string $message)
  • string $message: message to display
lock_record (line 712)

put a (co-operative) lock on a record

this tries to set the (co-operative) lock on the record with serial (pkey) $id in table $tablename by setting the $locked_by field to our own session_id. This is the companion routine of lock_release().

The mechanism of co-operative locking works as follows. Some tables (such as the 'nodes' table) have an int field, e.g. 'locked_by_session_id'. This field can either be NULL (indicating that the record is not locked) or hold the primary key of a session (indicating that the record is locked and also by which session).

Obtaining a lock boils down to updating the table and setting that field to the session_id. As long as the underlying database system guarantees that execution of an UPDATE statement is not interrupted, we can use UPDATE as a 'Test-And-Set'-function. According to the documentation MySQL does this.

The procedure is as follows.

  1. We try to set the locked_by-field to our session_id on the condition that the previous value of that field is NULL. If this succeeds, we have effectively locked the record.
  2. If this fails, we retrieve the current value of the field to see which session has locked it. If this happens to be us, we had already locked the record before and we're done.
  3. If another session_id holds the lock, we check for that session's existence. If it still exists, we're out of luck: we can't obtain the lock unless $force is TRUE. In that case we simply overrule the current lock and make it ours, if and only if the existing lock was granted to our user_id.
  4. If that other session no longer exists, we try to replace that other session's session_id with our own session_id, once again using a single UPDATE (avoiding another race condition). If that succeeds we're done and we have the lock; if it fails we're also done but without lock.

If locking the record fails because the record is already locked by another session, this routine returns information about that other session in $lockinfo. It is up to the caller to use this information or not.

Note. A record can stay locked if the webbrowser of the locking session has crashed. Eventually this will be resolved if the crashed session is removed from the sessions table. However, the user may have restarted her browser while the record was locked. From the new session it appears that the record is still locked. This may take a while. Mmmmm... The other option is to lock on a per-user basis rather than per-session basis. Mmmm... Should we ask the user to override the session if it happens to be the same user? Mmm. put it on the todo list. (A small improvement might be to call the garbage collection between step 2 and 3. Oh well).

  • return: TRUE if locked successfully, FALSE on error or already locked; extra info returned in $lockinfo
  • todo: do we need a 'force lock' option to forcefully take over spurious locks? Maybe guru can override?
  • todo: perhaps we can save 1 trip to the database by checking for something like UPDATE SET locked_by = $session_id WHERE (id = $id) AND ((locked_by IS NULL) OR (locked_by = $session_id)) but I don't know how many affected rows that would yield if we already had the lock and effectively nothing changes in the record. (Perhaps always update atime to force 1 affected row?)
  • usedby: lock_release_node()
  • usedby: lock_record_node()
bool lock_record (int $id, array &$lockinfo, string $tablename, string $pkey, string $locked_by, string $locked_since, [bool $force = FALSE])
  • int $id: the primary key of the record to lock
  • array &$lockinfo: returns information about the session that already locked this record
  • string $tablename: the name of the table
  • string $pkey: name of the field holding the serial (pkey)
  • string $locked_by: name of the field to hold our session_id indicating we locked the record
  • string $locked_since: name of the field holding the datetime when the lock was obtained
  • bool $force: TRUE means we grab the current lock from our other session
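The four steps of the procedure can be walked through with an in-memory sketch. Everything concrete here is an assumption: the function name, the array layouts for records and sessions, and the absence of $lockinfo. In the real routine each step is a database query, and steps 1 and 4 rely on a single conditional UPDATE to stay free of race conditions, which a plain array cannot demonstrate.

```php
<?php
/**
 * Hypothetical in-memory walk-through of the locking procedure.
 * $records:  id => array('locked_by' => session_id or null)
 * $sessions: session_id => user_id, live sessions only
 */
function sketch_lock_record(array &$records, array $sessions, int $id, int $my_session, bool $force = false): bool
{
    $holder = $records[$id]['locked_by'];
    // Step 1: nobody holds the lock, so we take it (UPDATE ... WHERE locked_by IS NULL).
    if ($holder === null) {
        $records[$id]['locked_by'] = $my_session;
        return true;
    }
    // Step 2: we already held the lock.
    if ($holder === $my_session) {
        return true;
    }
    // Step 3: the holder is alive; take over only when forced and owned by the same user.
    if (isset($sessions[$holder])) {
        if ($force && $sessions[$holder] === ($sessions[$my_session] ?? null)) {
            $records[$id]['locked_by'] = $my_session;
            return true;
        }
        return false;
    }
    // Step 4: the holder's session is gone; replace the stale lock
    // (UPDATE ... WHERE locked_by = $holder in a real database).
    $records[$id]['locked_by'] = $my_session;
    return true;
}
```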
lock_record_node (line 629)

get record lock on a node

this is a wrapper around lock_record() for locking nodes.

  • return: TRUE if locked successfully, FALSE on error or already locked; extra info returned in $lockinfo
  • uses: lock_record()
bool lock_record_node (int $node_id, array &$lockinfo, [bool $force = FALSE])
  • int $node_id: the primary key of the node to lock
  • array &$lockinfo: returns information about the session that already locked this record
  • bool $force: TRUE means we grab the current lock from our other session
lock_release (line 816)

unlock a record that was previously successfully locked

this removes the (co-operative) lock on the record with serial (pkey) $id in table $tablename by setting the $locked_by field to NULL. This is the companion routine of lock_record().

  • return: TRUE if lock removed successfully, FALSE on error or lock not found
bool lock_release (int $id, string $tablename, string $pkey, string $locked_by, string $locked_since)
  • int $id: the primary key of the record to unlock
  • string $tablename: the name of the table
  • string $pkey: name of the field holding the serial (pkey)
  • string $locked_by: name of the field holding the session_id of the session that locked the record
  • string $locked_since: name of the field holding the datetime when the lock was obtained
lock_release_node (line 642)

release lock on a node

this is a wrapper around lock_release() for unlocking nodes.

  • return: TRUE if lock removed successfully, FALSE on error or lock not found
  • uses: lock_record()
bool lock_release_node (int $node_id)
  • int $node_id: the primary key of the node record to unlock
logger (line 366)

a simple function to log information to the database 'for future reference'

This adds a message to the table log_messages, including a time, the remote address and (of course) a message. See also the standard PHP-function syslog(). We use the existing symbolic constants for priority. Default value is WLOG_INFO.

Note that messages with a priority WLOG_DEBUG are only written to the log if the global parameter $CFG->debug is TRUE. All other messages are simply logged, no further questions asked.

If the caller does not provide a user_id, this routine attempts to read the user_id from the global $_SESSION array, i.e. we try to link events to a particular user if possible.

Note that with a field definition of varchar(150) there is room to store either an IPv4 address (max 15 bytes) or a full-blown IPv6 address (39-47 bytes, see RFC3989) or even twice a complete reverse DNS address (see update_core_2011092100()).

See also task_logview() for a rant on the difference between LOG_DEBUG and LOG_INFO.

  • return: FALSE on error, TRUE on success
  • todo: should we make this configurable and maybe log directly to syslog (with automatic logrotate) or do we want to keep this 'self-contained' (the webmaster can read the table, but not the machine's syslog)?
  • usedby: login_send_bypass()
  • usedby: login_send_laissez_passer()
  • uses: $CFG
bool logger (string $message, [int $priority = WLOG_INFO], [ $user_id = ''])
  • string $message: the message to write to the log
  • int $priority: loglevel, see PHP-function syslog() for a list of predefined constants
  • $user_id
magic_unquote (line 83)

this circumvents the 'magic' in magic_quotes_gpc() by conditionally stripping slashes

Magic quotes are a royal pain for portability. If magic quotes are enabled, this function reverses the effect. There are three PHP-parameters in php.ini affecting the magic:

  • the directive 'magic_quotes_runtime'
  • the directive 'magic_quotes_gpc'
  • the directive 'magic_quotes_sybase'
This routine deals with undoing the effect of the latter two. The effect of magic_quotes_runtime can be undone via set_magic_quotes_runtime(0). This is done once at program start (See initialise() in init.php).

This routine should be used to unquote strings from $_GET[], $_POST[] and $_COOKIE whenever they are needed.

Important note: because third party subsystems may deal with magic quotes on their own, it is a Bad Idea[tm] to globally replace the contents of $_GET[], $_POST[] and $_COOKIE with the unescaped values once at program start. Any subsystem would be confused if magic_quotes_gpc() indicates that the magic is in effect whereas in reality the magic was already undone at program start. Yes, this yields a performance penalty, but this magic was a mess right from the start. Hopefully PHP6 will get rid of this magic for once and for all...

  • return: the unescaped string
string magic_unquote (string $value)
  • string $value: a string value that is conditionally unescaped
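
The conditional unescaping described above might look roughly like the following. This is a minimal sketch, not the actual code of magic_unquote(): the function name magic_unquote_sketch() and the two extra parameters $gpc and $sybase are hypothetical and only exist so the logic can be exercised on PHP versions that no longer have magic quotes.

```php
<?php
// Sketch only: $gpc and $sybase are hypothetical hooks, not part of magic_unquote().
function magic_unquote_sketch($value, $gpc = null, $sybase = null)
{
    if ($gpc === null) {
        // get_magic_quotes_gpc() was removed in PHP 8, hence the guard
        $gpc = function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc();
    }
    if ($sybase === null) {
        $sybase = (bool) ini_get('magic_quotes_sybase');
    }
    if (!$gpc) {
        return $value;                          // no magic in effect: nothing to undo
    }
    if ($sybase) {
        return str_replace("''", "'", $value);  // sybase style doubles single quotes
    }
    return stripslashes($value);                // classic style escapes with backslashes
}
```

A caller would apply this to each individual value taken from $_GET[], $_POST[] or $_COOKIE at the moment the value is needed, rather than rewriting those arrays globally.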
performance_get_queries (line 600)

return the number of database queries that were executed

  • return: the number of queries
  • uses: $DB
int performance_get_queries ()
performance_get_seconds (line 612)

return the script execution time

  • return: interval between begin execution and now
  • todo: maybe we should get rid of this $PERFORMANCE object, because it doesn't do that much anyway
double performance_get_seconds ()
quasi_random_string (line 282)

generate a string with quasi-random characters

This generates a string of $length quasi-random characters. The optional parameter $candidates determines which characters are eligible. Popular choices for $candidates are:

  • 10 (minimum): use only digits from 0,...,9
  • 16: use digits 0,...,9 or letters A,...,F
  • 36 (default): use digits 0,...,9 or letters A,...,Z
  • 62: use digits 0,...,9 or letters A,...,Z or letters a,...,z
If $candidates is smaller than 10, 10 is used; if $candidates is greater than 62, 62 is used.

Note that this is an ASCII-centric routine: we only use plain ASCII letters and digits and none of the 64000 other Unicode characters in the Basic Multilingual Plane. The reason is simple: 7-bit ASCII characters have the best chance of getting through communication channels unmangled.

void quasi_random_string (int $length, [int $candidates = 36])
  • int $length: length of the string to generate
  • int $candidates: number of candidate-characters to choose from
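
One way to picture the role of $candidates is as the length of a prefix of a fixed 62-character alphabet. The sketch below assumes that layout and uses mt_rand() as the source of quasi-randomness; both are assumptions, and the name quasi_random_string_sketch() is hypothetical.

```php
<?php
// Sketch only: alphabet layout and the use of mt_rand() are assumptions.
function quasi_random_string_sketch($length, $candidates = 36)
{
    $alphabet = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
    $candidates = max(10, min(62, (int) $candidates));   // force into the range 10,...,62
    $s = '';
    for ($i = 0; $i < $length; ++$i) {
        // pick one of the first $candidates characters of the alphabet
        $s .= $alphabet[mt_rand(0, $candidates - 1)];
    }
    return $s;
}
```

With this layout the value 16 yields uppercase hexadecimal digits and the value 10 yields decimal digits only, which matches the popular choices listed above.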
queue_area_node_alert (line 891)

add a message to message queue of 0 or more alerts

this adds $alert_message to the message buffers of 0 or more alert accounts. The alerts that qualify to receive this addition are determined via the alerts_areas_nodes table. The logic in that table is as follows:

  • the area_id must match the area_id(s) (specified in $areas) OR it must be 0 which acts as a wildcard for ALL areas
  • the node_id must match the node_id(s) (specified in $nodes) OR it must be 0 which acts as a wildcard for ALL nodes
Also the account must be active and the flag for the area/node-combination must be TRUE.

As a rule this routine is called with a single area_id in $areas and a collection of node_id's in $nodes. The nodes follow the path up through the tree, in order to alert accounts that are only watching a section at a higher level.

Example: If user 'webmaster' adds a new page, say 34, to subsection 8 in section 4 in area 1, you get something like this:

queue_area_node_alert(1,array(8,4,34),'node 34 added','webmaster');

The effect will be that all alerts with the following combinations of area A and node N have the message added to their buffers:

  • A=0, N=0 - qualifies for all nodes in all areas
  • A=1, N=0 - qualifies for all nodes in area 1
  • A=1, N=4 - qualifies for node 4 in area 1
  • A=1, N=8 - qualifies for node 8 in area 1

It is very well possible that no message is added at all if there is no alert watching the specified area and node (using wildcards or otherwise).

cron.php is to take care of eventually sending the queued messages.

Note that this routine adds a timestamp to the message and, if it is specified, the name of the user.

Also note that the messages are added to the buffer with the last message at the top; this means that the receiver will travel back in time reading the collection of messages. This is based on the assumption that the latest messages sometimes override a previous message and therefore should be read first.

  • uses: $DB
void queue_area_node_alert (mixed $areas, mixed $nodes, string $alert_message, [string $username = ''])
  • mixed $areas: an array or a single int identifying the area(s) of interest
  • mixed $nodes: an array or a single int identifying the node(s) of interest
  • string $alert_message: the message to add to the buffer of qualifying alert accounts
  • string $username: (optional) the name of the user that initiated the action
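
The wildcard matching used by queue_area_node_alert() can be illustrated for a single row of the alerts_areas_nodes table. This is a sketch under assumptions: the helper alert_row_qualifies() is hypothetical, and it ignores the additional conditions (active account, flag set to TRUE) mentioned above.

```php
<?php
// Sketch only: $row_area and $row_node stand for the area_id and node_id
// of one row in alerts_areas_nodes; the helper itself is hypothetical.
function alert_row_qualifies($row_area, $row_node, $areas, $nodes)
{
    $areas = (array) $areas;    // accept a single int or an array
    $nodes = (array) $nodes;
    $area_ok = ($row_area == 0) || in_array($row_area, $areas);   // 0 = all areas
    $node_ok = ($row_node == 0) || in_array($row_node, $nodes);   // 0 = all nodes
    return $area_ok && $node_ok;
}
```

For the example call with area 1 and nodes 8, 4 and 34, a row with area 1 and node 4 qualifies, whereas a row with area 2 and node 0 does not.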
quoted_printable (line 1694)

convert string $s from native format to quoted printable (RFC2045)

this converts the input string $s to quoted printable form as defined in RFC2045 (see http://www.ietf.org/rfc/rfc2045.txt). By default this routine assumes a line-oriented text input. This can be overruled by calling the routine with the parameter $textmode set to FALSE: in that case the input is considered to be a binary string with no embedded newlines.

The routine assumes that the input lines are delimited with $newline. By default this parameter is a LF (Linefeed) but it could be changed to another delimiter using the function parameter $newline.

According to RFC2045 the resulting output lines should be no longer than 76 bytes, even though it is very well possible to use shorter lines. This can be done by setting the parameter $max_length to the desired value. Note that this value is forced to be in the range 4,...,76.

The encoding is defined in section 6.7 of RFC2045 with these five rules.

(1) General 8bit representation: any character may be represented as "=" followed by two uppercase hexadecimal digits.

(2) Literal representation characters "!" to "~" but excluding the "=" may represent themselves.

(3) White space Space " " and tab "\t" at the end of a line must use rule (1); in all other cases either rule (1) or (2) may be applied.

(4) Line breaks The (hard) line breaks in the input must be represented using "\r\n" in the output.

(5) Soft line breaks Output lines may not be longer than 76 bytes. This can be enforced by inserting a soft line break (the string "=\r\n") in the output. This soft line break will disappear once the encoded string is decoded.

The basic conversion algorithm is constructed using two important variables:

  • an integer value ($remaining) indicating the number of bytes left in the current output line
  • a boolean flag ($next_is_newline) indicating if the next input character is a $newline
The variable $remaining keeps track of situations where the current character (either as (1) General 8bit representation or (2) Literal representation) might not fit on the current line (e.g. 2 bytes left requires an 8bit representation to be moved to the next output line). The flag $next_is_newline is used to make the best possible use of the available remaining space in the output, e.g. if the current character is exactly as long as the remaining space, we can output that character on the current output line, because we are sure that it is the last character on the current output line so there cannot be a soft return next.

Note that spaces (ASCII 32) and tabs (ASCII 9) are treated differently depending on their position in the line. The rule is that both should be represented as "=20" or "=09" at the end of an input line and that it is allowed to use " " or "\t" when NOT at the end of an input line. In the latter case, the output line will always end with a soft line break "=\r\n" which makes sure that there are no trailing spaces/tabs in the output line anyway.

Also note that the end of the input $s is also flagged via setting $next_is_newline. This is an optimisation which treats spaces and tabs at the end of the input as if they were at the end of an input line, i.e. converting to "=20" or "=09". This means that the output will never end with a space or a tab, even if the input does.

Note that in case of a binary conversion the input character(s) that might otherwise indicate a newline are to be considered as binary data. However, if the data is completely binary, it probably doesn't make sense to use Quoted-Printable in the first place (base64 would probably be a better choice).

Reference: see http://www.ietf.org/rfc/rfc2045.txt.

  • return: encoded string according to RFC2045
  • todo: should we change the code to accommodate the canonical newline CRLF in the input?
string quoted_printable (string $s, [bool $textmode = TRUE], [string $newline = "\n"], [int $max_length = 76])
  • string $s: source string
  • bool $textmode: TRUE means newlines count as hard line breaks, FALSE is binary data
  • string $newline: native character indicating end of line
  • int $max_length: indicates the limit for output lines (excluding the CRLF)
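
The interplay of $remaining and $next_is_newline described for quoted_printable() can be sketched in a simplified form. The code below is an illustration only, assuming text mode and a fixed "\n" newline; qp_encode_sketch() is a hypothetical name and the real routine may differ in detail.

```php
<?php
// Sketch only: text mode, newline fixed to "\n"; not the actual quoted_printable().
function qp_encode_sketch($s, $max_length = 76)
{
    $max_length = max(4, min(76, (int) $max_length));   // force into the range 4,...,76
    $out = '';
    $remaining = $max_length;          // bytes left in the current output line
    $n = strlen($s);
    for ($i = 0; $i < $n; ++$i) {
        $c = $s[$i];
        if ($c === "\n") {             // rule (4): hard line break
            $out .= "\r\n";
            $remaining = $max_length;
            continue;
        }
        // the end of the input counts as end of line too
        $next_is_newline = ($i + 1 >= $n) || ($s[$i + 1] === "\n");
        $o = ord($c);
        if ($o >= 33 && $o <= 126 && $c !== '=') {
            $token = $c;                           // rule (2): literal
        } elseif (($c === ' ' || $c === "\t") && !$next_is_newline) {
            $token = $c;                           // rule (3): not at end of line
        } else {
            $token = sprintf('=%02X', $o);         // rule (1): =XX
        }
        // unless this is the last token on the line, keep 1 byte for a soft break
        $needed = strlen($token) + ($next_is_newline ? 0 : 1);
        if ($needed > $remaining) {
            $out .= "=\r\n";                       // rule (5): soft line break
            $remaining = $max_length;
        }
        $out .= $token;
        $remaining -= strlen($token);
    }
    return $out;
}
```

Decoding the result with PHP's built-in quoted_printable_decode() should return the input again, apart from hard line breaks now being CRLF.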
redirect_and_exit (line 573)

redirect to another url by sending an http header

nothing redirect_and_exit (string $url, [ $message = ''])
  • string $url: the url to redirect to
  • $message
replace_crlf (line 315)

unfold a possible multiline string

This removes all linefeeds and carriage returns from a string. Typical use would be to strip newlines from a subject line in a mail message, which might otherwise interfere with proper sending of mail headers.

  • return: the string with offending characters replaced
string replace_crlf (string $multiline_string, [string $replacement = ''])
  • string $multiline_string: the multiline string to strip
  • string $replacement: (optional) the string to replace newlines
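
The unfolding done by replace_crlf() amounts to a plain string replacement. A minimal sketch, assuming every single CR and LF is replaced individually (the name replace_crlf_sketch() is hypothetical):

```php
<?php
// Sketch only: each CR and each LF is replaced separately with $replacement.
function replace_crlf_sketch($multiline_string, $replacement = '')
{
    return str_replace(array("\r", "\n"), $replacement, $multiline_string);
}
```

Applied to a subject line, this prevents an embedded newline from starting an unintended extra mail header.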
sanitise_filename (line 1556)

sanitise a string to make it acceptable as a filename/directoryname

this routine analyses and maybe converts the input string as follows:

  • all leading and trailing dots, spaces, dashes, underscores, backslashes and slashes are removed
  • all embedded spaces, backslashes and slashes are converted to underscores
  • only letters, digits, dots, dashes or underscores are retained
  • all sequences of 2 or more underscores are replaced with a single underscore
  • finally all 'forbidden' words (including empty string) get an underscore prefixed
Note that this sanitising only satisfies the basic rules for filenames; creating a new file with a sanitised name may still clash with an existing file or subdirectory.

Also note that a full pathname will yield something that looks like a simple filename without directories or drive letter: C:\Program Files\Apache Group\htpasswd becomes C_Program_Files_Apache_Group_htpasswd and /etc/passwd becomes etc_passwd. Also this routine makes a URL look like a filename: http://www.example.com becomes http_www.example.com.

Finally note that we don't even attempt to transliterate utf8-characters or any other characters between 128 and 255; these are simply removed.

  • return: sanitised filename which is never empty
  • todo: should we check for overlong UTF-8 encodings: C0 AF C0 AE C0 AE C0 AF equates to /../ or is that dealt with already by filtering on letters/digits and embedded dots/dashes/underscores?
string sanitise_filename (string $filename)
  • string $filename: the string to sanitise
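
The sanitising steps listed for sanitise_filename() can be approximated with a handful of regular expressions applied in the listed order. This is a sketch, not the actual implementation; in particular the list of 'forbidden' words below is a made-up placeholder, since the real list is not documented here.

```php
<?php
// Sketch only: $forbidden is a hypothetical placeholder list.
function sanitise_filename_sketch($filename)
{
    $forbidden = array('', 'con', 'prn', 'aux', 'nul');
    // strip leading/trailing dots, spaces, dashes, underscores, backslashes, slashes
    $s = preg_replace('#^[. \-_\\\\/]+|[. \-_\\\\/]+$#', '', $filename);
    // embedded spaces, backslashes and slashes become underscores
    $s = preg_replace('#[ \\\\/]#', '_', $s);
    // retain only letters, digits, dots, dashes and underscores
    $s = preg_replace('#[^A-Za-z0-9._\-]#', '', $s);
    // collapse sequences of 2 or more underscores
    $s = preg_replace('#_{2,}#', '_', $s);
    // prefix 'forbidden' words (including the empty string) with an underscore
    if (in_array(strtolower($s), $forbidden, true)) {
        $s = '_' . $s;
    }
    return $s;
}
```

With these steps the three examples given above come out as documented, and an empty input yields a single underscore.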
short_datim (line 2299)

construct an abbreviated date/time from a full date/time string

this converts the date/time 'yyyy-mm-dd hh:mm:ss' into either

  • hh:mm (for today's dates)
  • Ddd hh:mm (for the past week)
  • yyyy-mm-dd (for the rest)

  • return: abbreviated date/time
string short_datim (string $dt)
  • string $dt: date/time to convert into short abbreviated form
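
The three output forms of short_datim() can be illustrated as follows. This is a sketch under assumptions: the extra parameter $now is hypothetical (added so the result does not depend on the current time), 'the past week' is taken to mean less than 7 x 24 hours ago, and day names come from date() and are therefore English.

```php
<?php
// Sketch only: $now is a hypothetical parameter; the week boundary is an assumption.
function short_datim_sketch($dt, $now = null)
{
    $now = ($now === null) ? time() : $now;
    $ts = strtotime($dt);
    if ($ts === false) {
        return $dt;                       // unparsable input returned unchanged (assumption)
    }
    if (date('Y-m-d', $ts) === date('Y-m-d', $now)) {
        return date('H:i', $ts);          // today: hh:mm
    }
    if ($ts < $now && $ts > $now - 7 * 86400) {
        return date('D H:i', $ts);        // past week: Ddd hh:mm
    }
    return date('Y-m-d', $ts);            // everything else: yyyy-mm-dd
}
```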
string2time (line 225)

convert a string representation of a date/time to a timestamp

this is a crude date/time parser. We collect digits and convert to integers. With the integers we fill an array with at least 6 integers, corresponding to year, month, day, hours, minutes and seconds. If there are fewer than six numbers in the source string, the value 0 is used for the remaining elements. Note that a number in this context is always a non-negative number because a dash (or minus) is considered a delimiter.

Note that valid date/time values are limited to how many seconds can be represented in a signed long integer, where 0 equates to 1970-01-01 00:00:00 (the Unix epoch). The upper limit for a 32-bit int is some date in 2038 (only 30 years from now).

  • return: unix timestamp (seconds since epoch) or FALSE on error
bool|long string2time (string $timestring)
  • string $timestring: date/time in the form yyyy-mm-dd hh:mm:ss
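
The 'collect digits, pad with zeros' approach of string2time() could look like the sketch below. The name string2time_sketch() is hypothetical, and returning FALSE when the input contains no digits at all is an assumption about what counts as an error.

```php
<?php
// Sketch only: the error condition (no digits found) is an assumption.
function string2time_sketch($timestring)
{
    // every run of digits is a number; anything else (including '-') is a delimiter
    if (preg_match_all('/[0-9]+/', $timestring, $m) < 1) {
        return false;
    }
    $n = array_map('intval', array_slice($m[0], 0, 6));
    $n = array_pad($n, 6, 0);             // missing elements become 0
    list($year, $month, $day, $hour, $min, $sec) = $n;
    return mktime($hour, $min, $sec, $month, $day, $year);
}
```

Note that mktime() interprets the values in the default timezone of the running script.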
t (line 332)

translation of phrases via a function with a very short name

This is only a wrapper function for $LANGUAGE->get_phrase()

  • return: translated string with optional values from array 'replace' inserted
  • uses: $LANGUAGE
string t (string $phrase_key, [string $full_domain = ''], [array $replace = ''], [string $location_hint = ''], [string $language = ''])
  • string $phrase_key: indicates the phrase that needs to be translated
  • string $full_domain: (optional) indicates the text domain (perhaps with a prefix)
  • array $replace: (optional) an assoc array with key-value-pairs to insert into the translation
  • string $location_hint: (optional) hints at a directory location of language files
  • string $language: (optional) target language
tree_build (line 1112)

construct a tree of nodes in memory

this reads the nodes in the specified area from disk and constructs a tree via linked lists (sort of). If parameter $force is TRUE, the data is read from the database, otherwise a cached version is returned (if available).

Note that this routine also 'repairs' the tree when an orphan is detected. The orphan is automagically moved to the top of the area. Of course, it shouldn't happen, but if it does we are better off with a magically _appearing_ orphan than a _disappearing_ node.

A lot of operations in the page manager work with a tree of nodes in some way, e.g. walking the tree and displaying it or walking the tree and collecting the sections (but not the pages), etc.

The tree starts with a 'root' with key 0 ($tree[0]). This is the starting point of the tree. The nodes at the top level of an area are linked from this root node via the field 'first_child_id'. If there are no nodes in the area, this field 'first_child_id' is zero. The linked list is constructed by using the node_id. All nodes in an area are collected in an array. This array is used to construct the linked lists.

Every node has a parent (via 'parent_id'), where the nodes at the top level have a parent_id of zero; this points to the 'root'. The nodes within a section or at the top level are linked forward via 'next_sibling_id' and backward via 'prev_sibling_id'. A zero indicates the end of the list. Children start with 'first_child_id'. A value of zero means: no children.

The complete node record from the database is also stored in the tree. This is used extensively throughout the pagemanager; it acts as a cache for all nodes in an area.

Note that we cache the node records per area. If two areas are involved, the cache doesn't work very well anymore. However, this doesn't happen very often; only in case of moving nodes from one area to another (and even then).

  • return: contains a 'root node' 0 plus all nodes from the requested area if any
  • todo: what if we need the trees of two different areas? should the static var here be an array, keyed by area_id?
  • todo: repairing a node doesn't really belong here, in this routine. we really should have a separate 'database repair tool' for this purpose. someday we'll fix this....
array tree_build (int $area_id, [bool $force = FALSE])
  • int $area_id: the area to make the tree for
  • bool $force: if TRUE forces reread from database (resets the cache)
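
The linked-list structure that tree_build() produces can be illustrated with an in-memory sketch. Everything here is an assumption for illustration: tree_build_sketch() is a hypothetical name, it takes the node records as an array instead of reading the database, it assumes the records arrive in sibling order, and it re-attaches orphans at the end of the top level.

```php
<?php
// Sketch only: records are passed in (no database, no cache); ordering is assumed.
function tree_build_sketch(array $records)
{
    $blank = array('parent_id' => 0, 'prev_sibling_id' => 0,
                   'next_sibling_id' => 0, 'first_child_id' => 0, 'record' => null);
    $tree = array(0 => $blank);                    // the 'root' with key 0
    foreach ($records as $r) {
        $tree[$r['node_id']] = array_merge(
            $blank, array('parent_id' => $r['parent_id'], 'record' => $r));
    }
    $last_child = array();                         // parent_id => last linked child
    foreach ($records as $r) {
        $id = $r['node_id'];
        $parent = $tree[$id]['parent_id'];
        if (!isset($tree[$parent]) || $parent == $id) {
            $parent = 0;                           // 'repair' an orphan: move to top level
            $tree[$id]['parent_id'] = 0;
        }
        if (isset($last_child[$parent])) {
            $prev = $last_child[$parent];          // append after the previous sibling
            $tree[$prev]['next_sibling_id'] = $id;
            $tree[$id]['prev_sibling_id'] = $prev;
        } else {
            $tree[$parent]['first_child_id'] = $id;
        }
        $last_child[$parent] = $id;
    }
    return $tree;
}
```

Walking the children of a section then means starting at 'first_child_id' and following 'next_sibling_id' until a zero is found.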
tree_visibility (line 1216)

calculate the visibility of the nodes in the tree

this flags visible nodes as visible. Here 'visible' means that

  • the node is not hidden, not expired and not under embargo
  • the section has at least 1 visible node (page or section)
As a side effect, any subtree starting at a hidden/expired/embargoed section is completely set to invisible so we don't risk accidentally showing a page from an invisible section. This routine walks through the tree recursively.

  • return: TRUE when there is at least 1 visible node, FALSE otherwise
  • todo: how about making all nodes under embargo visible when previewing a page or at least the path from the node to display?
bool tree_visibility (int $subtree_id, array &$tree, [bool $force_invisibility = FALSE])
  • int $subtree_id: the starting point for the tree walking
  • array &$tree: pointer to the current tree
  • bool $force_invisibility
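
The recursive walk of tree_visibility() might be pictured as below. This is a sketch under assumptions: the fields 'is_page', 'is_blocked' (standing for hidden, expired or under embargo) and 'is_visible' are hypothetical names, and the function is started at the first node of a sibling list rather than at a section.

```php
<?php
// Sketch only: 'is_page', 'is_blocked' and 'is_visible' are hypothetical field names.
function tree_visibility_sketch($subtree_id, array &$tree, $force_invisibility = false)
{
    $any_visible = false;
    // walk the sibling list starting at $subtree_id; 0 marks the end of the list
    for ($id = $subtree_id; $id != 0; $id = $tree[$id]['next_sibling_id']) {
        $blocked = $force_invisibility || $tree[$id]['is_blocked'];
        if ($tree[$id]['is_page']) {
            $visible = !$blocked;
        } else {
            // always descend, so a blocked section forces its whole subtree invisible
            $child_visible = tree_visibility_sketch(
                $tree[$id]['first_child_id'], $tree, $blocked);
            $visible = !$blocked && $child_visible;   // a section needs a visible child
        }
        $tree[$id]['is_visible'] = $visible;
        $any_visible = $any_visible || $visible;
    }
    return $any_visible;
}
```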
userdir_delete (line 2193)

remove an 'empty' directory that used to contain (user)files

this removes the left-over files in the directory $CFG->datadir.$path and subsequently the directory itself. The allowable left-over files are those that are skipped in userdir_is_empty(), i.e. the entries that are filtered out:

  • . and .. (directory housekeeping)
  • index.html of 0 bytes ('protects' directory from prying eyes)
  • symbolic links
  • thumbnails (filenames starting with THUMBNAIL_PREFIX)
This filtering is the same as that in the file manager (see filemanager.class.php).

Note that any symbolic links are deleted too.

  • return: TRUE on success, FALSE otherwise
bool userdir_delete (string $path)
  • string $path: the directory path relative to $CFG->datadir, e.g. '/areas/exemplum' or '/users/acackl'
userdir_is_empty (line 2145)

determine whether a directory is empty (free from (user)files)

this scans the directory $CFG->datadir.$path to see if it is empty, i.e. does not contain any (user)files. Returns TRUE if empty, FALSE otherwise. The (user)files we look at are those that are not filtered out. The following entries are filtered out (skipped):

  • . and .. (directory housekeeping)
  • index.html of 0 bytes ('protects' directory from prying eyes)
  • symbolic links
  • thumbnails (filenames starting with THUMBNAIL_PREFIX)
This filtering is the same as that in the file manager (see filemanager.class.php).

  • return: TRUE if no (user)files or subdirectories exist in $path, FALSE otherwise
bool userdir_is_empty (string $path)
  • string $path: the directory path relative to $CFG->datadir, e.g. '/areas/exemplum' or '/users/acackl'
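
The filtering used by userdir_is_empty() can be sketched as a simple directory scan. The sketch takes a full directory path instead of a path relative to $CFG->datadir, and the thumbnail prefix is passed as a parameter with a made-up default because the value of THUMBNAIL_PREFIX is not given here; both are assumptions, as is the name userdir_is_empty_sketch().

```php
<?php
// Sketch only: full path instead of $CFG->datadir.$path; prefix default is made up.
function userdir_is_empty_sketch($dir, $thumbnail_prefix = 'zz_thumb_')
{
    $entries = @scandir($dir);
    if ($entries === false) {
        return false;                    // unreadable: treat as not empty (assumption)
    }
    foreach ($entries as $entry) {
        $full = $dir . '/' . $entry;
        if ($entry === '.' || $entry === '..') {
            continue;                    // directory housekeeping
        }
        if (is_link($full)) {
            continue;                    // symbolic links are skipped
        }
        if ($entry === 'index.html' && filesize($full) === 0) {
            continue;                    // empty index.html 'protects' the directory
        }
        if (strncmp($entry, $thumbnail_prefix, strlen($thumbnail_prefix)) === 0) {
            continue;                    // thumbnails are skipped
        }
        return false;                    // anything else is a (user)file or subdirectory
    }
    return true;
}
```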
was_file_url (line 1895)

construct a url that links to a file via /file.php

This constructs a URL that links to a file, either

    /file.php/path/to/file.txt
or
    /file.php?file=/path/to/file.txt
depending on the global setting for proxy-friendly urls.

Furthermore, if the flag $fully_qualified is TRUE, we include scheme and authority in the resulting URL, ie.

    http://exemplum.eu/file.php/path/to/file.txt

string was_file_url (string $path, [bool $fully_qualified = FALSE])
  • string $path: the name of the file including path
  • bool $fully_qualified: if TRUE forces the URL to contain a scheme, authority etc., else use shortened form
was_node_url (line 1957)

construct a ready-to-use href which links to the node $node via index.php

this routine creates a ready-to-use href that links to node $node, taking these options into account:

  • the href is replaced with a bare '#' if we are in preview mode
  • the href is either fully qualified or abbreviated, depending on $qualified
  • the node_id is conveyed either as a proxy-friendly url or a simple parameter ?node=$node_id
  • if we use friendly url, $node_id always comes first in the path (without the word 'node')
  • if we use friendly url, $bookmark is appended as the last item in the path
  • the additional parameters (if any) are sandwiched between the node_id and the bookmark
This routine mainly deals with constructing a friendly url taking parameters into account in the form of path components. The node_id is conveyed as the first parameter and it has no associated name, ie. the url is shortened from '/was/index.php/node/35' to '/was/index.php/35'. All other parameters from the array $parameters (if any) are added as pairs: '/key1/value1/key2/value2/key3/value3' etc. The last parameter added to this path is based on the $bookmark or, if that is empty the node's title. The purpose of this parameter is to create a URL that looks like a descriptive filename, which makes it easier for the visitor to bookmark this page and still have a clue as to what the page is about. Otherwise this parameter is not used at all; the 'real' navigation information is in the node_id and the additional parameters. Example:
    was_node_url($tree[35]['record'],array('photo'=>'5'),'Picture of our field trip')
yields the following URL (when friendly urls are used):
    /was/index.php/35/photo/5/Picture_of_our_field_trip.html
or (when friendly urls are not used):
    /was/index.php?node=35&photo=5
The interesting bits are the node_id (35) and the photo_id (5). The filename-like string 'Picture_of_our_field_trip.html' is merely a suggestion to the browser and is not used by W@S.

string was_node_url ([array $node = NULL], [array|null $parameters = NULL], [string $bookmark = ''], [bool $preview = FALSE], [bool $qualified = FALSE])
  • array $node: record straight from the database (or $tree)
  • array|null $parameters: additional parameters for the url (path components in friendly url mode)
  • string $bookmark: the basis for a visual clue to identify the node (in friendly url mode only)
  • bool $preview: if TRUE, the href is replaced with a bare '#' to obstruct navigation in preview mode
  • bool $qualified: if TRUE use the scheme and authority, otherwise use the short(er) form without scheme/authority
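
The path construction of was_node_url() in friendly url mode can be approximated as follows. This is a sketch under assumptions: was_node_url_sketch() is a hypothetical name, the parameters $friendly and $base replace the global configuration, the $qualified option is left out, and the way the bookmark is turned into a filename-like string is a guess based on the example above.

```php
<?php
// Sketch only: $friendly and $base are hypothetical stand-ins for global settings.
function was_node_url_sketch(array $node, $parameters = null, $bookmark = '',
                             $preview = false, $friendly = true,
                             $base = '/was/index.php')
{
    if ($preview) {
        return '#';                                  // obstruct navigation in preview mode
    }
    $parameters = is_array($parameters) ? $parameters : array();
    $node_id = (int) $node['node_id'];
    if (!$friendly) {
        $query = array_merge(array('node' => $node_id), $parameters);
        return $base . '?' . http_build_query($query, '', '&');
    }
    $path = '/' . $node_id;                          // node_id comes first, without a name
    foreach ($parameters as $key => $value) {
        $path .= '/' . rawurlencode($key) . '/' . rawurlencode($value);
    }
    // last path component: a visual clue based on bookmark or node title
    $title = ($bookmark !== '') ? $bookmark : $node['title'];
    $clue = trim(preg_replace('/[^A-Za-z0-9]+/', '_', $title), '_');
    return $base . $path . '/' . $clue . '.html';
}
```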
was_url (line 1853)

massage a possibly relative URL to make it more qualified

Here we perform some heuristics: if $url looks like it is relative, we prepend the correct path (from $CFG) to it.

Here a URL is considered relative when it does NOT start with a slash and it does NOT start with a scheme followed by '://'. Additionally, we make a distinction between a relative URL starting with 'program/' (which indicates a static file somewhere in the program directory) and other relative URLs (which are assumed to start in the CMS Root Directory, the directory where index.php, admin.php & friends live).

Note: according to RFC3986 a scheme must start with a letter and can contain only letters, digits, '+', '-' or '.'. Note: all string operations here are ASCII; no UTF-8 issues here.

If $fully_qualified is TRUE we always make a relative URL fully qualified.

If $url starts with a slash, we must assume that the caller means some file relative to the document root, or rather: relative to the top level directory embedded in $CFG->www. If we have $url starting with a slash AND $fully_qualified is TRUE, we extract the scheme and authority from $CFG->www and prepend that to $url. This is a heuristic approach.

Example 1: 'program/styles/base.css' becomes '/program/styles/base.css' OR 'http://exemplum.eu/program/styles/base.css'

Example 2: 'file.php/areas/exemplum/logo.jpg' becomes '/file.php/areas/exemplum/logo.jpg' OR 'http://exemplum.eu/file.php/areas/exemplum/logo.jpg'

Example 3: '/path/to/foo/bar/logo.jpg' becomes '/path/to/foo/bar/logo.jpg' OR 'http://exemplum.eu/path/to/foo/bar/logo.jpg'

string was_url (string $url, [bool $fully_qualified = FALSE])
  • string $url: the (possibly) relative URL to massage
  • bool $fully_qualified: if TRUE forces the URL to contain a scheme, authority etc., else use shortened form
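
The heuristics of was_url() can be sketched with the site URL passed in as a parameter. This is an illustration only: was_url_sketch() is a hypothetical name, $www stands in for $CFG->www, and the special treatment of URLs starting with 'program/' is left out.

```php
<?php
// Sketch only: $www stands in for $CFG->www; the 'program/' distinction is omitted.
function was_url_sketch($url, $fully_qualified = false, $www = 'http://exemplum.eu')
{
    // a scheme starts with a letter, followed by letters, digits, '+', '-' or '.'
    if (preg_match('#^[A-Za-z][A-Za-z0-9+.\-]*://#', $url) === 1) {
        return $url;                                   // already fully qualified
    }
    $path = (string) parse_url($www, PHP_URL_PATH);    // '' when $www has no path
    $authority = substr($www, 0, strlen($www) - strlen($path));   // scheme + authority
    if (substr($url, 0, 1) !== '/') {
        $url = rtrim($path, '/') . '/' . $url;         // relative: prepend CMS root path
    }
    return $fully_qualified ? $authority . $url : $url;
}
```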

Documentation generated on Tue, 28 Jun 2016 19:12:41 +0200 by phpDocumentor 1.4.0