Archive for May, 2009

While working on my Google Summer of Code project today, I came across a bug that pretty much halted my productivity for the day.

Early on in my project, I decided that, among other things, working with Unicode is hard. Since I was restricted to using C, I had to find a way to easily manipulate Unicode strings, and I came across GLib. (I’m not even entirely sure how; I think I just remembered other projects using it and decided to look it up.)

Not only does it have Unicode handling, it also provides a bunch of convenient things like a linked list implementation, memory allocation routines, and so on – all in a way intended to be cross-platform, since this is the library that powers Gtk.

I’m not entirely sure how it differs from Apache’s Portable Runtime (APR); maybe it’s even a case of Not Invented Here syndrome. In any case, I, not suffering from that particular affliction, decided to be lazy and reuse existing code.

For some reason, GLib’s g_slice_alloc() call was failing. For those of you who don’t know what it does: it is similar to malloc() in standard C – it allocates a chunk of memory and returns a pointer to it, so that you can use dynamic memory allocation rather than making everything an auto variable. In particular, it means you can be flexible and allocate as much or as little memory as you need.

So I spent the entire day trying to figure out why my program was segfaulting. Looking at the output of gdb (the GNU Debugger), the backtrace showed that it was crashing at the allocation statement. No way, I thought; that library is so well-tested, it must be a problem with the way I’m using it.

I changed the code to use malloc() instead of g_slice_alloc(), and the program started crashing right away, rather than after four or five executions as it had with g_slice_alloc(). Not exactly useful for debugging.

I mentioned my frustration with C on the Debian Perl Packager Group channel, and a friend from the group, Ryan Niebur, took a look at the code (accessible via a public repository). After a bit of tinkering, he determined that the problem was that I was using g_slice_alloc instead of g_slice_alloc0, which automatically zeroes memory before returning it.

It stopped the crashing and my program works as intended. I’m still left totally puzzled as to why this bug was happening, and I’m not sure if malloc isn’t supposed to be used with structs, or some other limitation like that.

But thanks to the magic of open source and social coding/debugging, the expertise of many contributes to the success of a single project. It’s such a beautiful thing.

Update: There were a lot of questions and comments, mainly relating to the fact that malloc and friends return chunks of memory that may not have been zeroed.

Indeed, this was the first thing I considered, but the line it happened to be crashing on was the line containing the g_slice_alloc call itself, rather than any of the statements after it.

For those that are curious, all of the code is visible in the public repository.

I do realize that the fixes that have been made are pretty much temporary, and that they are probably just masking a bigger problem. However, I’m at a loss as to what the issue is. Hopefully the magic of open source will work for me again, and one of the many people who have commented will discover the issue.


Read Full Post »

The problem with secret questions is that they’re not very secretive. Back in the 1990s, it might have been reasonable to assume that information like “What was your first school?” might be secret. However, now, in the days of MySpace and Facebook, and especially Google, it’s fairly easy to find information about people.

A problem with passwords is that people forget them. On the other hand, a problem with other methods of authentication like public key cryptography is that they are not portable. Moreover, the problem with solutions like RSA’s SecurID token is that they can be lost.

Generally, the wisdom for truly secure systems is to combine these various approaches, in what is called Multi-Factor Authentication. Wikipedia describes the factors as something you know (a password), something you have (a token or card) and something you are (a biometric).

However, usually a password is sufficient because, if well chosen, and if the rest of the system is secure, it can provide a reasonable balance between security and convenience. Yes, passwords require memorizing, but on the other hand, you’ll never lose one unless you forget it. They are also weak because you can’t know if somebody shoulder-surfs and catches your password, or has installed a keylogger on a public workstation.

Worse, if you are using an unencrypted wireless network connection such as the one we have at Western, then your data (including your password) travels over-the-air in plaintext. I’ve actually tried sniffing packets using Wireshark over the wireless network, and captured quite a bit of stuff, all without having to log into the system. For those in the know, there is also a secure network, but it’s significantly harder to set up – it uses WPA2 Enterprise – and though Windows XP and Vista both support it, the added cost of setting it up doesn’t seem worthwhile to most people. But all this is the subject of another article.

Because secret questions are often used for password recovery in the event that your password is lost or your account is compromised (by an attacker who neglected to change your secret question), they are essentially a second password on your account.

Why are two passwords bad? They effectively double the chances that one of them will be cracked: if attackers find that your actual password is too difficult to crack, they can attack your secret question instead. Worse, because the question is considered “public” information (how else are you supposed to remember the answer, after all?), attackers have context for guessing your password.

Imagine having this as your real password: “Hint: it’s the name of your first son.” People then don’t need to know you very well at all to figure out your password. Worse, most people will give away this sort of information in conversation without realizing it. Find out what someone’s secret question is? Steer the conversation there. “Got any kids? What’s his name?” And so on.

This social engineering is particularly dangerous because while people know their passwords are precious, they are less likely to even remember their secret question and wouldn’t protect that information too much anyway.

Some better systems have opted to verify e-mail addresses instead, but then the e-mail account becomes the weakest link: if a user’s e-mail account gets hacked, the attacker has access to all of their other accounts through the “Send the password to my e-mail” feature.

There are lots of solutions to this, but I think the lesson is that human factors play the largest and most often overlooked part in software design, especially in web applications. For software security to improve, programmers and designers need to be a lot more careful: validate all input from forms and parameters, and sanitize output that goes to users’ browsers to eliminate Cross-Site Scripting (XSS) risks. It’s really just a reminder that a little bit of paranoia can go a long way toward protecting the end-user.

Read Full Post »

As energy demand continues to rise, cost-effective delivery of electric power becomes a daunting task.  Many jurisdictions have responded by introducing legislation to privatize the power generation industry, so that large networks of systems with multiple owners can share the load.  As remote power systems add more interconnections, new challenges are emerging related to overall power system stability, particularly in relation to distributed generation, as with renewable power sources in the home.

Traditionally, engineers integrated power systems under the assumption that power consumption increases gradually, so that operators can simply add generation capacity to meet demand and the system can be considered relatively unchanging with time.  Similarly, operators could compensate for changes in the overall power profile by adding inductors or capacitors to substations, depending on the typical load.  For example, a substation supplying power to an industrial mill would consume reactive volt-amperes (VARs), so the local utility would add a capacitor to compensate for the inductive load, in order to preserve voltage regulation.  However, once the mill is no longer operating (for example, at night), the resulting reduction in load causes the supply voltage to rise, potentially well above the desired level.

Flexible AC Transmission Systems (FACTS) are different because, as the name implies, they are flexible: designed to adjust dynamically to power demand and other power quality conditions.  A basic installation might consist of an operator- or microprocessor-controlled bank of capacitors that can supply reactive power when necessary.  The state of the art is to provide continuous switching using power electronic devices, which have a much faster response time than a human operator or even a microprocessor-based control system.  Novel devices can even filter harmonic oscillations, which can significantly reduce power flow and cause stability issues.

Optimum usage of current transmission line assets is the most cost effective option available and FACTS devices allow utilities to provide greater power delivery with better system stability and power quality.  It is often prohibitively expensive to build new power lines, so these devices provide a stopgap measure capable of delivering increased capacity while also reducing transmission losses.

This article was taken from a report which I co-authored. It was submitted to ECE3333: Power Systems I, taught by Professor Rajiv Varma at the University of Western Ontario in Spring 2009.

Read Full Post »

They say that premature optimization is the root of all evil, and they are right. Most likely, this code performs as well as, or potentially even worse than, checking that strcmp(s1, s2) == 0.

Nonetheless, while working on my Google Summer of Code project, I needed to test that two strings are equal while ignoring case. As a result, I wrote a simple function called strieq(s1, s2), which simply returns 1 if the strings are equal and 0 otherwise. This differs from strcmp because strcmp performs a full lexicographical comparison, which my particular application didn’t require.

So here I publish my code for your viewing pleasure. Possibly you will be able to use it, and while I’ve tested it and tried to make it safe, I’m not an extremely experienced C coder, so I may have missed something. If there is a bug, please report it (via e-mail or in the comments section) so that I can update it.

/* TRUE and FALSE as used below (GLib defines these; included here so the
 * snippet stands alone) */
#define TRUE  1
#define FALSE 0

int strieq(const char *, const char *);
static inline char lower(char);

/* strieq(const char *str1, const char *str2)
 * This function takes two ASCII strings and compares them for equality in a
 * case-insensitive manner. It doesn't work with Unicode strings -- that is,
 * it will only fix case for normal ASCII strings. Everything else is passed
 * through as-is.
 */
int strieq(const char *str1, const char *str2)
{
  while (TRUE)
  {
    /* Strings that end at different points cannot be equal */
    if ((*str1 == '\0' && *str2 != '\0') || (*str1 != '\0' && *str2 == '\0'))
      break;

    /* From the above test we know that either both strings continue or both
     * strings are at the end. The latter case means we have a match.
     */
    else if (*str1 == '\0')
      return TRUE;

    /* Check if the lowercased versions of the characters match */
    if (lower(*str1) != lower(*str2))
      break;

    str1++; str2++;
  }
  return FALSE;
}

/* lower(char ch)
 * This function takes a single alphabetic ASCII character and makes it
 * lowercase, useful for comparison.
 */
static inline char lower(char ch)
{
  /* Check that ch is between A and Z */
  if (ch >= 'A' && ch <= 'Z')
    return (ch + ('a' - 'A'));
  /* Otherwise return it as-is */
  return ch;
}

In case this is an issue:

I, the copyright holder of the above snippet for strieq and lower, hereby release the entire contents therein into the public domain. This applies worldwide, to the extent that it is permissible by law.

In case this is not legally possible, I grant any entity the right to use this work for any purpose, without any conditions, unless such conditions are required by law.

Update: Small fixes to documentation and code; WordPress must have stripped characters when I first inserted the code.

Read Full Post »

The first part of my Google Summer of Code project involves the creation of a library to handle Debian package control files, which are, upon closer inspection of the Debian Policy Manual, actually encoded as UTF-8 (8-bit Unicode).

Initially, the files just “looked” like ASCII (a rather common situation with a backward-compatible encoding like UTF-8). You see, UTF-8 is, by design, indistinguishable from ASCII unless characters outside the ASCII range are used – this is so that files containing only ASCII characters are also valid UTF-8.

All of this means there is the possibility of “wide characters” – that is, characters that require multiple bytes to encode, such as those in other languages’ scripts. Handling these cases in plain C becomes a bit tedious.

I had read about the GNOME Project’s GLib but hadn’t looked at it in any depth until now. Much to my surprise, I discovered that it is an entire framework of portable C code, providing I/O manipulation, string handling, and common data structures like trees and hashes, among other things. These functions are all Unicode-safe, too: strings are manipulated internally as UTF-32 and written back out as UTF-8. I’m all about code reuse and, being lazy and not completely understanding Unicode in depth (after all, there is the common statement that “internationalisation is hard”), I decided that using GLib was the best way to go.

The unfortunate side effect of this is a bit of wasted space – even if every character fits in 8 bits, 32 bits will be used to store each one, meaning 24 wasted bits (3 bytes) per character. An 8 KB ASCII file is roughly 8,000 characters, so about 24 KB (192 kilobits) are wasted for such a file, which could very well be a moderately complex Debian control file. All in all, it’s not a big deal, and the text can be converted back to UTF-8 later if desired.

Admittedly, the documentation available online in the GNOME Library – the GLib Reference Manual – leaves something to be desired. In particular, each function is explained in terms of its parameters, inputs and outputs, but no trivial example of its use is provided. Some of the functions are unclear about how they work – when manipulating strings in GLib 2.20, the documentation describes functions with signatures like:

GString* g_string_append_len(GString *string,
                             const gchar *val,
                             gssize len);

In this case, it’s unclear what the returned GString is useful for. C string functions like strcat (don’t use that, by the way; use strlcat instead) update strings in place – and although strcat technically returns a pointer to the destination too, that return value is usually ignored.

On the other hand, without knowing the intentions of the developers, it might be designed for portability: to ease the creation of bindings in languages that have no notion of returning through a pointer, such as Perl. However, in my opinion, that should be handled by the implementation of the bindings – as in, the XS glue code should have provisions for it.

Update: From reading the body of other functions, it seems that the return value is used to allow nested calls, like: g_string_ascii_down(g_string_append_len(…)). It’s sort of neat, really; shows some design foresight in the library.

Read Full Post »

After being introduced to the problem of power quality issues, one might wonder what the real implications are, particularly for residential consumers. This article explains what voltage fluctuations, harmonic oscillations and transients really are, and why they are important considerations for any Electrical Engineer.

These are closely related to a forthcoming article on Flexible AC Transmission Systems.

Voltage Fluctuations

When the receiving terminal of a transmission line operates a high-power load drawing a large amount of current, the system voltage tends to drop, leading to an undervoltage condition colloquially known as voltage sag.  This can have undesirable effects for consumers, since devices may malfunction and particularly sensitive equipment such as electronics may not work at all.  Factories and other industrial plants often consume large amounts of power, so they can cause frequent and prolonged periods of undervoltage if left unchecked.

Conversely, when the receiving end has a lower load than expected (known as load rejection), the voltage can exceed the nominal voltage by a significant margin.  This can happen when a large load is suddenly disconnected from the grid, such as when a factory’s circuit breaker trips.  During periods of low load on the transmission line, the line voltage increases along the length of the line due to the line charging capacitance.

Whatever the cause, installation of FACTS devices can correct voltage fluctuations without requiring manual intervention by agents of the power utility.  In fact, this is the primary function and advantage conferred by parallel (shunt) compensation devices, and it will be the topic of a future post.

Harmonics and Transients

Some loads, such as rectifiers, are non-linear in nature and can result in a distortion of the ideal sinusoidal waveform shape.  In other instances, disturbances such as lightning hitting power transmission lines or a sudden transient fault like a power line swaying in the wind and hitting a tree can interrupt power delivery if not detected and counteracted.  The problem of trees in transmission line paths is a particular concern because impacts with them can cause blackouts and complete system failures, as happened during the Northeast Blackout of 2003.

Harmonics in the power waveform can cause equipment damage or malfunction and, more importantly, can cause the power system to become unstable.  Thus, to raise the dynamic stability limit of the power system, devices must be in place to handle harmonic and transient disturbances.  Luckily, engineers and scientists have developed a FACTS configuration useful for controlling these problems as well.

This article was taken from a report which I co-authored. It was submitted to ECE3333: Power Systems I, taught by Professor Rajiv Varma at the University of Western Ontario in Spring 2009.

Read Full Post »

The primary mission of the power utility is to deliver power to consumers reliably and cost-effectively.  Ultimately, consumers do not care about the intricacies of managing power systems, but they require that the system “works.”  For the average consumer, whether a power system works or not depends largely on the design of their devices, which can vary based on things like government legislation.

To that end, standards bodies closely regulate the power utility, specifying things like the nominal bus voltage, the frequency and the allowed range of variability.  Collectively, how closely these parameters stay within their nominal ranges is known as power quality.

Voltage, power factor, harmonic content and transient behaviour are the important attributes of received power, and the utility must ensure that these remain consistent even amidst disturbances on the line.  One common disturbance that can cause power system stability issues is a lightning strike, which causes a temporary surge in transmission line voltage.  Since loads vary with time, the utility cannot guarantee power quality without designing resilient systems capable of dynamically adapting to the needs of the system.

In the ideal situation, voltage should appear as a sinusoidal wave at exactly the mains voltage and frequency. The image below provides a graphical illustration of the four most common power quality issues that utilities must work to eliminate to the extent possible.

Power Quality Problems

The cause, behaviour and effects of each of these will be the topic of future posts.

This article was taken from a report which I co-authored. It was submitted to ECE3333: Power Systems I, taught by Professor Rajiv Varma at the University of Western Ontario in Spring 2009.

Read Full Post »