Posts Tagged ‘Google Summer of Code’

Last year, I had a great time participating in the Google Summer of Code with the Debian project. I had a neat project with some rather interesting implications for helping developers package and maintain their work. It’s still a work in progress, of course, as many open source projects are, but I was able to accomplish quite a bit and am proud of my work. I learned a great deal about coding in C and working with Debian, and met some very intelligent people.

My student peers were also very intelligent and great to learn from. I enjoyed meeting them virtually and discussing our various projects on the IRC channel as the summer progressed and the Summer of Code kicked into full swing. The Debian project in particular also helps arrange travel grants for students to attend the Debian Conference (this year, DebConf10 is being held in New York City!). DebConf provides a great venue to learn from other developers, both in official talks and in unofficial hacking sessions. As the social aspect is particularly important to Debian, DebConf helps people meet those with whom they work the most, creating lifelong friendships and making open source fun.

I have had several interviews for internships, and the part of my work experience asked about most is my time doing the Google Summer of Code. I really enjoyed seeing a project go from the proposal stage, through setting a reasonable timeline with my mentor and exploring the state of the art, to the most important part: developing the software. I think this is the sort of indispensable industry-type experience we often lack in our undergrad education. We might have an honours thesis or presentation, but much of the work in the Google Summer of Code actually gets used “in the field.”

Developing software for people rather than for marks is significant in a number of ways, but most importantly it means there are real stakeholders that must be considered at all stages. Proposing brilliant new ideas is important; however, without highlighting the benefits they can have for various users, they simply will not gain traction. Learning how to write proposals effectively is an important skill, and working with my prospective mentor (at the time; he later mentored my project once it was accepted) to develop mine was tremendously useful for my future endeavours.

The way I see it, the Google Summer of Code is in many ways similar to an academic grant (and the stipend is about the same as well). It provides a modest salary (this year it’s US$5000) but, more importantly, personal contact with a mentor. Mentors are typically veterans of software development or the Debian project and act in the same role as supervisors for post-graduate work: they monitor your progress and propose new ideas to keep you on track.

The Debian Project is looking for more students and proposals. We have a list of ideas as well as application instructions available on our Wiki. As I will be going on an internship starting in May, I have offered to be a mentor this year. I look forward to seeing your submissions (some really interesting ones have already begun to filter in as the deadline approaches).


It’s been some time since I re-installed Debian over my Kubuntu install, so I thought I’d discuss some reasons why I changed back to Debian, what my experience was like, and some learning opportunities.

One reason I made the switch was that a utility I needed, Frama-C, had been newly packaged for Debian and was not available in Kubuntu at the time. I was also having various frustrations with the Kubuntu installation, not the least of which was an unreliable and quite crashy KDE Plasma.

When I reinstalled this time, I picked the normal install but told it to install a graphical environment, which gave me a GNOME desktop. I actually rather like it: the setup didn’t ask too many questions and everything was configured properly. There was some minor tweaking, but it was all done through the easily accessible System menu and the applets therein.

Now, I wanted to be able to use the server both as a virtual machine and as a physical dual-boot. This wasn’t working properly with GRUB-2, so I had to stay with version 1.96, which works rather well. I even spent some time making a pretty splashimage for it, which looks rather nice, even if I don’t see it all that often.

If I boot the install as a virtual machine, all the hardware is detected properly, and there aren’t even complaints about the fact that a bunch of hardware disappeared — certainly very good news if you ever decide to do something like move your hard drive to a different machine. Likewise, if I boot it natively on the desktop, everything works well there too.

One issue I came across during the installation was having to teach Network-Manager how to configure my network interfaces. In my VMware NAT setup, there is no DHCP server, so the IP address, subnet and gateway information needs to be statically defined. Luckily, Network-Manager was able to do this based on the MAC address of the adapter — inside my virtual machine, it had a VMware-assigned static one. Through this, Network-Manager had an easy way to determine how to configure my network, and it works beautifully for Ethernet and Wireless (when Debian is running as the main operating system) and also for VMware NAT (when inside the virtual machine container).
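
For reference, here is roughly what the equivalent static setup would look like in /etc/network/interfaces, the traditional Debian alternative to letting Network-Manager store the settings. The interface name and addresses are hypothetical placeholders for a typical VMware NAT subnet, not my actual values:

# /etc/network/interfaces stanza for a statically addressed VMware NAT adapter
auto eth0
iface eth0 inet static
    address 192.168.148.10
    netmask 255.255.255.0
    gateway 192.168.148.2
    # VMware's NAT gateway usually sits at .2; check the Virtual Network Editor
    dns-nameservers 192.168.148.2
    # the dns-nameservers option needs the resolvconf package to take effect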

Anyway, I have now been developing quite happily inside a Debian + GNOME desktop environment. The system runs fine even within a constrained environment, though I miss Kubuntu’s sudo-based setup; with the Debian GNOME install, the default seems to be to enter the root password every time privilege escalation is necessary. I don’t like using a root password — on my server system I don’t use the root password at all, and do everything I need to do via sudo. That’s okay for me because I log into the server with a private key and have disabled SSH password authentication for security reasons.
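
Getting the same sudo-based workflow on a fresh Debian install takes only a couple of commands as root; a quick sketch, with a placeholder user name:

# as root: install sudo, then grant a user full rights through visudo
apt-get install sudo
visudo
# in the editor that visudo opens, add a line like this
# ('jdoe' is a placeholder for your own login):
#   jdoe    ALL=(ALL) ALL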

One thing that is still weird for me is that my system currently shows a time of 01:53 even though it is 23:57 in my timezone. Presumably the few minutes of difference is because the virtual machine clock and my system hardware clock aren’t synchronized perfectly, but more than that, I think it’s an issue with the Date applet somehow. I haven’t looked into this because the thing is running inside a virtual machine, so it doesn’t bother me much.

I have looked high and low to see where to change the time zone, and to my knowledge the system knows that it’s in the America/Toronto time zone. The house picture next to Timmins (the city I am in right now, though it doesn’t matter since the timezone is the same) seems to indicate to me that it’s set to the appropriate time zone.

I think it’s due to VMware synchronizing the virtual machine clock with my host machine clock. Windows (my host operating system) stores the hardware clock in local time, which I believe Linux interprets as UTC. Still, that doesn’t explain the odd offset the Date applet is displaying.
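
If the root cause really is the local-time/UTC mismatch, Debian of this vintage lets the guest be told that the hardware clock is kept in local time. A sketch of what I would try, assuming the stock sysvinit setup:

# in /etc/default/rcS: tell Debian the hardware clock is in local time, not UTC
UTC=no

# then, as root, resync the system clock from the hardware clock
hwclock --hctosys --localtime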

Someone noted last time that I didn’t directly mention which programs are offered only on Windows and not on Linux, and which do not have a reasonable replacement on these systems. Kapil Hari Paranjape noted that I was sounding somewhat like a troll by simply saying that I don’t think Linux is yet ready to replace my environment. Here was my reply:

Far from being a troll, I’d really like Debian and Ubuntu, and Linux in general, to keep improving at the pace they have been. Linux has made great progress since the last time I tried it out on my desktop, but I have to acknowledge that there are lots of rough edges right now that need to be worked out.

One of the advantages of huge proprietary development organizations like Microsoft is that they have tons of developers and can implement new features at a relatively quick pace, even if they’re half-assed. Developers’ pride in the FOSS community prevents this overly quick pace of development in favour of more secure, more stable platforms, which is a good thing, I think. But it nonetheless results in a “slower” development pace.

The applications I’m complaining about are things like:
– SolidWorks (a CAD tool for designing parts and assemblies, used in manufacturing and mechanical engineering)
– Spectrum Software Micro-Cap (a circuit simulation package similar to PSpice, used by my school)
– AutoCAD (another CAD tool)

Luckily this is changing, though only for the largest and most popular distributions:
– MathWorks MATLAB (runs on Linux and Solaris, etc.)
– Wolfram Mathematica (which has versions for Linux and MacOS X)
– FEKO (runs on Linux and Solaris among others)

Anyway, I still consider SolidWorks a rather major program not supported on Linux, which is a big issue for those working on Civil Engineering programs. There are most probably other, very domain-specific ones that I don’t even know about.

There is a nice matrix comparing cross-platform capabilities of CAD software: http://en.wikipedia.org/wiki/Comparison_of_CAD_software

Oh, one final thought: perhaps that KDE Recommends: should be downgraded to a Suggests: instead, on account of its heavy dependencies: it pulls mysql-server onto desktop machines. WTF!

Oh, and on another note, I re-installed Debian using the non-expert Auto Install and it installed GNOME rather flawlessly, much like installing Ubuntu, which was pretty nice. So kudos to those who have been working on the main installer; the advanced ones really give you some rope to hang yourself with, though :-)

Oh, and k3ninho told me that there is an initiative from the Ubuntu community called “100 Paper Cuts” to help fix small bugs like those I was complaining about. I hope this leads to an improved user experience, and I’d really like to see some of those changes propagated both upstream to Debian and upstream to the KDE folk.

During my install of Kubuntu + KDE, I felt that Plasma was crashing more than Windows Explorer — it felt like the days when I was running Windows ME, when the shell would crash and the system would restart it. Repeatedly. That’s exactly what seemed to happen with Plasma. I’m not sure if it was something I screwed up during configuration (presumably so), but KDE was far too complicated for me to try to debug. It might also have been a result of running my system within a fairly constrained virtual machine environment – the system only gets 768 MB of RAM and no access to the actual graphics processing unit (since it’s virtualized).


Recently there was a thread on the Google Summer of Code students’ list discussing gender dynamics in open source and, more broadly, interactions between those of different genders (the discussion was mainly simplified to one of sexes, which I think demonstrates a lack of understanding of the difference between gender and sex; but I suppose that’s a blog post for another day).

It was noted that many of the women on the list have blog addresses and other details that quickly self-identify the authors as female. There was discussion about whether this is a good thing or not, and the possible reasons behind it.

Here is what I wrote:

I think what you mention about yourself shows the world what you think about yourself, and what you consider yourself.

If first and foremost you associate your identity with being female (or male) or straight (or not)… then I guess that’s your prerogative.

But I, for one, am not /just/ an Asian male. I’m not just a Computer Science student. I’m not just a coder. I’m not just an Engineering student. I’m not just 20 years old. I’m not just a blogger. I’m not just an Open Source contributor. I’m not just an advocate of strange and often unpopular ideas.

I am a human being, with many dimensions. And I don’t try to simplify it by putting myself in a box and categorizing myself as anything.

I think that the key is just to understand everyone for who they are, and part of that is being somewhat ambiguous. As Leslie [Hawthorne] somewhat alluded to, it’s about managing people’s preconceptions about you.

I do not actively try to hide that I am male, or that I am Asian (you might guess that from my last name). There are all sorts of preconceptions people might have about things, and there are lots of -isms we should seek to avoid. (I’m Asian – maybe that means I’m a bad driver, and that I can’t pronounce Rs. I’m male – maybe I’m violent. I’m in Computer Science, presumably that means I play Dungeons & Dragons with my classmates on the weekends. I’m in Engineering, maybe that means I’m sexist.)

The reality is: none of these things should matter, nor should they define you.

Just be yourself. You show to the world what you consider relevant about yourself.

And for what it’s worth, I found out the other day that someone I respect and admire in the open source community is a teenager. Somewhere around 15 years old. It’s impressive, really. I look up to him, because he’s a really smart guy. But that wasn’t something he brought up right away; his nickname wasn’t “smartdude15” or anything like that. That’s the magic of open source, and the Internet — I judged him purely on his knowledge. And once I did find out, I thought to myself… Wow, would I have thought the same thing of him if I had known his age right away? Would I have even given him a chance, or would I have just dismissed everything he said as something an immature teenager might say?

I think that along with sexism there are tons of other issues to worry about, like racism and homophobia (consider how difficult it is in some cultures, and even in Western culture, to be really accepted if you are gay, lesbian, transgender, bisexual, two-spirited, asexual, or intersex…). In fact, being gay was classified as a mental disorder until relatively recently.

I’m glad for all the progress women have made in the past several decades. Not everyone has reached a point where they are accepted in mainstream society, and not everyone feels comfortable announcing certain details about themselves.

If *all* you are is a woman in a male-dominated world, then I feel sorry for you. I truly, truly do. Because none of the women I respect and admire are that. They are, first, talented Engineers, Scientists and Programmers, who are only incidentally female. Being female isn’t something that really identifies them any more than the colour of their skin, hair or eyes. No, no, they are talented, and that is, in the end, all I care about, and that is one reason I am grateful for Open Source — because you oftentimes don’t meet the people you are working with all the time in real life, so you cannot judge them on anything other than their ability.


For my Google Summer of Code project, I have been working with the PerlQt4 bindings, which require that I have Qt4 installed. While this is technically possible under a Win32 environment, it is far more practical under Linux, which is what prompted this install. Lots of people in the free software community vehemently oppose Windows, but while it has its flaws, I think its hardware support is overall still much better than Linux’s. True, this is because of Microsoft’s shady business practices, and because many companies keep their driver source code closed. I’m still using Windows XP Professional and am quite happy with it, stability-wise and feature-wise.

As an engineering student, I find that many applications we use on a regular basis are simply not available on Linux. They’re not replaceable with the current state of open source software either, though there is some great stuff out there. Nonetheless, we’re still far from a point where engineers in general can switch to Linux — application support is as important to an operating system as the kernel. Linux would be nothing without GNU’s binutils, for example.

I tried to install Debian first, as this is an environment I’m very familiar with. I use Debian on my development server, and it has worked wonders there. But everything I do on that server is command-line stuff. When trying to install a desktop environment, I followed the KDE Configuration Wizard, which isn’t too bad, but it expects an Internet connection throughout the process. The problem was that I didn’t have enough Ethernet cables to keep both the desktop computer and my laptop plugged in at the same time, even with a wireless router set up, which meant the other machine had to stay unplugged while packages were updating. Some of the updates took quite a bit of time, which was inconvenient for everyone else.

I eventually got the system to install, and told tasksel to set up a desktop environment. While it was installing, I also typed ‘apt-get install kde’ and assumed everything would Just Work. It installed a whole bunch of stuff (which included a local install of mysqld, on a desktop machine?! — it turns out that was due to one of KDE’s recommended packages; it starts with an A, I forget which). Even then, the environment didn’t “just work” as I had expected. Upon booting up the system, it just dropped me to a command-line prompt. Fine, I thought, I’ll just use startx. But that was broken too. So after another few hours of fiddling I gave up altogether.
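
For what it’s worth, apt can be told not to pull in recommended packages at all, which would have kept mysqld off the machine. A sketch of the two usual ways to do that (the config file name below is my own choice; only the directory is standard):

# one-off: skip Recommends for this install only
apt-get install --no-install-recommends kde

# or permanently, via a file such as /etc/apt/apt.conf.d/99norecommends:
APT::Install-Recommends "false";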

To try Ubuntu again (the last time I had done so was probably around version 7), I downloaded a recent image of Kubuntu 9.04, the Ubuntu flavour that uses KDE as its default desktop environment. It’s surprising how much progress there has been in Ubuntu and in Linux in general. I have found that driver support is much better than it used to be: it now detects my network card – a Broadcom 43xx chip – and does everything it needs to do. For the most part, the operating system “Just Works.” Great. This looks like something I might be able to slowly transition toward, completely replacing Windows except inside WINE or a virtual machine container.

Have Debian and Ubuntu made lots of progress? Sure. I can definitely see that Ubuntu is geared a lot more toward the average user, while Debian provides bleeding-edge features for the power user. Unfortunately, despite being involved in packaging Perl modules for Debian, I fall into the former category. I’d really just like my desktop system to work. Oh, and dual-monitor support out of the box would be nice too — I hear the new KDE and GNOME support this.

One thing Windows handles rather well is changing hardware profiles – when my computer is connected to its docking station, a ton of peripherals are attached; when I undock, they’re gone. Windows handles this rather gracefully. In Kubuntu, I got lots of notification boxes repeatedly telling me that eth2 was disconnected, and so on. This sort of thing is indecipherable to the average user, so I’d really like these operating systems to become more human-friendly before they can be considered ready for prime time on the desktop.


While working on my Google Summer of Code project today, I came across a bug that pretty much halted my productivity for the day.

Early on in my project, I decided that working with Unicode is hard, among other things. Since I was restricted to using C, I had to find a way to manipulate Unicode easily, and I came across GLib (I’m not entirely sure how; I think I just remembered other projects using it and decided to look it up).

Not only does it have Unicode handling, it also provides a bunch of convenient things like a linked list implementation, memory allocation helpers, and so on, all in a way intended to be cross-platform, since this is the library that powers GTK+.

I’m not entirely sure how it differs from Apache’s Portable Runtime (APR); maybe it’s even a case of Not Invented Here syndrome. In any case, I, not suffering from that particular affliction, decided to be lazy and re-use existing code.

For some reason, GLib’s g_slice_alloc() call was failing. For those of you who don’t know what it does, it is similar to malloc() in standard C – it allocates a chunk of memory and returns it to you, so that you can make use of dynamic memory allocation rather than everything just being auto variables. In particular, it means you can be flexible and allocate as much or as little memory as you need.

So I spent the entire day trying to figure out why my program was segfaulting. Looking at the output of gdb (the GNU Debugger), I could see from the backtrace that it was crashing at the allocation statement. No way, I thought; that function is so well-tested, it must be a problem with the way I’m using it.

I changed the code to use malloc() instead of g_slice_alloc(), and the program started crashing right away, rather than after four or five executions with g_slice_alloc(). Not exactly useful for debugging.

I mentioned my frustration with C on the Debian Perl Packager Group channel, and a friend from the group, Ryan Niebur, took a look at the code (accessible via a public repository). After a bit of tinkering, he determined that the problem was that I was using g_slice_alloc instead of g_slice_alloc0, which automatically zeroes memory before returning it.

That stopped the crashing, and my program now works as intended. I’m still left totally puzzled as to why this bug was happening, and whether malloc isn’t supposed to be used with structs, or there is some other limitation like that.
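
To illustrate the failure mode with a minimal, made-up struct (a sketch of the general problem, not my actual project code): memory from g_slice_alloc() arrives with whatever garbage was in it before, so any pointer field you forget to initialize may be dereferenced later and crash, whereas g_slice_alloc0() hands the struct back zeroed, so such fields start out as NULL.

#include <glib.h>

typedef struct {
  gchar *name;   /* must be set before use */
  GList *items;  /* NULL means "empty list" to GLib's list functions */
} Package;

int main(void)
{
  /* Uninitialized: 'name' and 'items' contain garbage until assigned. */
  Package *p1 = g_slice_alloc(sizeof(Package));

  /* Zero-filled: 'name' is NULL and 'items' is a valid empty list. */
  Package *p2 = g_slice_alloc0(sizeof(Package));

  p2->items = g_list_append(p2->items, "dpkg");    /* safe: starts from NULL */
  /* p1->items = g_list_append(p1->items, "dpkg");    would walk a garbage pointer */

  g_list_free(p2->items);
  g_slice_free1(sizeof(Package), p1);
  g_slice_free1(sizeof(Package), p2);
  return 0;
}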

But thanks to the magic of open source and social coding/debugging, the expertise of many contributes to the success of a single project. It’s such a beautiful thing.

Update: There were a lot of questions and comments, mainly relating to the fact that malloc and friends return chunks of memory that may not have been zeroed.

Indeed, this was the first thing I considered, but the line it was crashing on was the one that did the g_slice_alloc call itself, rather than any of the statements after it.

For those that are curious, all of the code is visible in the public repository.

I do realize that the fixes that have been made are pretty much temporary and are probably just masking a bigger problem. However, I’m at a loss as to what the issue is. Hopefully the magic of open source will work for me again, and one of the many people who have commented will discover it.


They say that premature optimization is the root of all evil, and they are right. Most likely, this code performs about as well as, or potentially even worse than, checking that strcasecmp(s1, s2) == 0.

Nonetheless, while working on my Google Summer of Code project, I needed to test whether two strings are equal while ignoring case. As a result I wrote a simple function, strieq(s1, s2), which returns 1 if the strings are equal and 0 otherwise. This differs from strcmp in that strcmp returns the information needed for a full lexicographical comparison, which my particular application didn’t require.

So here I publish my code for your viewing pleasure. Possibly you will be able to use it, and while I’ve tested it and tried to make it safe, I’m not an extremely experienced C coder, so I may have missed something. If there is a bug, please report it (via e-mail or in the comments section) so that I can update it.

/* TRUE and FALSE come from GLib in my project; defined here so that the
 * snippet stands alone. */
#ifndef TRUE
#define TRUE  1
#define FALSE 0
#endif

int strieq(const char *, const char *);
static inline char lower(char);

/* strieq(const char *str1, const char *str2)
 *
 * This function takes two ASCII strings and compares them for equality in a
 * case-insensitive manner. It doesn't work with Unicode strings -- that is,
 * it will only fix case for normal ASCII strings. Everything else is passed
 * through as-is.
 */
int strieq(const char *str1, const char *str2)
{
  while (TRUE)
  {
    if ((*str1 == '\0' && *str2 != '\0') || (*str1 != '\0' && *str2 == '\0'))
      break;

    /* From the above test we know that either both strings still have
     * characters left or both are at the end. The latter case is a match.
     */
    else if (*str1 == '\0')
      return TRUE;

    /* Check if the lowercased version of the characters match */
    if (lower(*str1) != lower(*str2))
      break;

    str1++; str2++;
  }
  return FALSE;
}

/* lower(char ch)
 *
 * This function takes a single alphabetic ASCII character and makes it
 * lowercase, useful for comparison.
 */
static inline char lower(char ch)
{
  /* Check that ch is between A and Z */
  if (ch >= 'A' && ch <= 'Z')
    return (ch + ('a' - 'A'));
  /* Otherwise return it as-is */
  return ch;
}

In case this is an issue:

I, the copyright holder of the above snippet for strieq and lower, hereby release the entire contents thereof into the public domain. This applies worldwide, to the extent that it is permissible by law.

In case this is not legally possible, I grant any entity the right to use this work for any purpose, without any conditions, unless such conditions are required by law.

Update: Small fixes to documentation and code; WordPress must have stripped characters when I first inserted the code.


The first part of my Google Summer of Code project involves the creation of a library to handle Debian package control files, which are, upon closer inspection of the Debian Policy Manual, actually encoded as UTF-8 (the 8-bit Unicode Transformation Format).

Initially, the files just “looked” like ASCII (a rather common situation with a backward-compatible encoding like UTF-8). You see, UTF-8 is, by design, indistinguishable from ASCII unless characters outside the ASCII range are used – this is so that files containing only ASCII characters are also valid UTF-8.

All of this means there is the possibility of “wide” characters – that is, characters that require multiple bytes to encode, such as those used in other languages’ scripts. Handling these cases in plain C gets a bit tedious.

I had read about the GNOME Project’s GLib but had not looked at it in any depth until now. Much to my surprise, I discovered that it is an entire framework of portable C code, providing I/O manipulation, string handling, and common data structures like trees and hash tables, among other things. These functions are Unicode-safe too: text can be manipulated internally as UTF-32 and written back out as UTF-8. I’m all about code reuse and, being lazy and not understanding Unicode in depth (after all, there is the common refrain that “internationalisation is hard”), I decided that using GLib was the best way to go.

The unfortunate side effect of this is a bit of wasted space – even if all the characters are 8 bits wide, 32 bits will be required to store each of them, meaning 24 bits (3 bytes) of wasted space per character. An 8 kB ASCII file is roughly 8,000 characters, meaning roughly 24 kB of space is wasted for this file, which could very well be a moderately complex Debian control file. All in all, it’s not a big deal, and the text can be converted back to UTF-8 later if desired.
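
As a rough illustration of how that UTF-8 to UTF-32 round trip looks with GLib’s generic conversion functions (a minimal sketch, not code from my library; the sample string is made up):

#include <glib.h>
#include <stdio.h>

int main(void)
{
  const gchar *utf8 = "Paquet: libc6 \xc3\xa9";   /* UTF-8 input with one accented char */
  glong n_chars = 0;
  GError *error = NULL;

  /* Decode UTF-8 into fixed-width UTF-32 (gunichar) for easy per-character work */
  gunichar *ucs4 = g_utf8_to_ucs4(utf8, -1, NULL, &n_chars, &error);
  if (ucs4 == NULL) {
    fprintf(stderr, "decode failed: %s\n", error->message);
    g_error_free(error);
    return 1;
  }
  printf("%ld characters (not bytes)\n", (long) n_chars);

  /* Re-encode as UTF-8 before writing the control file back out */
  gchar *roundtrip = g_ucs4_to_utf8(ucs4, n_chars, NULL, NULL, NULL);
  printf("%s\n", roundtrip);

  g_free(roundtrip);
  g_free(ucs4);
  return 0;
}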

Admittedly, the documentation available online in the GNOME Library – the GLib Reference Manual – leaves something to be desired. In particular, each function is explained in terms of its parameters, input and output, but no trivial example of its use is provided. For some functions it is unclear how they are meant to be used — for example, when manipulating strings in GLib 2.20, the documentation describes functions with signatures like:

GString* g_string_append_len(GString *string, const gchar *val, gssize len);

In this case, it’s unclear what the returned GString is useful for. C functions like strcat (don’t use that, by the way; use strlcat instead) update strings in place, and their return value is rarely used.

On the other hand, without knowing the developers’ intentions, it might be designed to ease the creation of bindings for languages that do not have the notion of returning through a pointer, such as Perl. In my opinion, though, that should be the binding implementation’s concern: the XS glue code should have provisions for it.

Update: From reading the bodies of other functions, it seems that the return value is there to allow nested calls, like g_string_ascii_down(g_string_append_len(…)). It’s sort of neat, really; it shows some design foresight in the library.
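
Here is a small sketch of that nesting style in practice (a made-up example, not taken from the GLib documentation or my project):

#include <glib.h>
#include <stdio.h>

int main(void)
{
  GString *field = g_string_new("Package: ");

  /* Because each call returns the GString, the calls can be chained:
   * append a fixed-length chunk, then lowercase the whole buffer. */
  g_string_ascii_down(g_string_append_len(field, "LIBDPKG-PERL extra junk", 12));

  printf("%s\n", field->str);      /* prints "package: libdpkg-perl" */

  g_string_free(field, TRUE);      /* TRUE frees the character data too */
  return 0;
}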
