Here is our status at the end of summer 2024.

- LupBook was my personal focus of the summer. Among other things, I did a lot more code refactoring and created a new navigation system (inspired by Jillian Lim’s prototype). Jhaydine Bandola, a summer intern via the UC LEADS program, also implemented a few improvements to some of our interactive activities. Finally, we have a project website at https://luplab.gitlab.io/lupbook/home/ where one can test out our latest build.
- After a year-long hiatus, our LupIO project has made a massive leap forward thanks to two summer interns from the CITRIS program: Kushagra Tiwari and Shengmin Liu. They implemented almost all of the LupIO devices in SystemVerilog and integrated them into the CVA6 platform. We now have an almost complete RISC-V-based, LupIO-equipped hardware system running on FPGA!
- Noah Krim finished a first implementation of `libvrv`, our independent RISC-V emulation library, which is the core of our VRV (Virtual RISC-V) framework. Noah has now graduated, so I took over the development.

Jhaydine is a 2nd-year computer science undergrad and is part of the UC LEADS program. She implemented some new features in LupBook.

Kushagra and Shengmin are both 3rd-year electrical engineering undergrads and participated in the CITRIS Workforce Innovation program. They implemented most of the LupIO devices in Verilog.

Here is our status at the end of the spring quarter 2024.

- Noah Krim and I have started our reimplementation of VRV (Virtual RISC-V). We decided to extract the emulation engine into an independent library (`libvrv`), which Noah has started implementing. This library provides a public API which is used by various clients. I rewrote the command-line client (`vrv-cli`).
- LupBook has made some good progress. Arnav Rastogi and Russell Umboh finished implementing a new “Horizontal Parsons” component. Jillian Lim implemented a prototype of a new navigation system. I undertook the gruelling task of refactoring our codebase, since many of our components were sharing a lot of code.
- Niharika Misal and Saili Karkare wrote draft papers about our current CS study looking at the perception of success and actual success between native students and transfer students.

Then the pandemic happened, and as we continued studying these gender differences for the same two core CS courses, which were now taught remotely, we noticed a dramatic change. For the first time, male and female students were participating at the same rate on the online class forum.

To understand what could have been driving these changes in participation behavior, we ran a survey of students enrolled in some of these online classes. While we found that students of both genders tended to compare themselves to their peers less when classes were online, we also found that this trend was much more pronounced for female students than for male students. This data suggested that the participation habits observed among female students in typical in-person classes were not inherent gender differences but rather a product of the environment.

Despite our best efforts, this study was rejected twice by top CS education conferences (ACM SIGCSE TS and ACM ITiCSE) in 2022. Unfortunately, it was never resubmitted, as Maddii (the main author) graduated at the end of 2022, and the rationale for studying the effects of the pandemic was beginning to diminish. However, I’m happy to announce that our paper is now available on arXiv and will hopefully still be able to reach its intended audience.

Three students from LupLab presented their work on LupBook at the *35th Annual UC Davis Undergraduate Research, Scholarship & Creative Activities Conference* last week. Jillian Lim presented her work on adding a navigational sidebar and improving the UI/UX, while Arnav Rastogi and Russell Umboh presented the new interactive activities that they have developed since last year.

Here is our status at the end of the winter quarter 2024.

- Noah Krim and I are working on a big overhaul of VRV (Virtual RISC-V). Our first version was a fairly straightforward port of SPIM to RISC-V. In the second version, we are reimplementing almost everything from scratch, and we’re making the emulation engine a separate project. The goal is to eventually be able to build an online interface for this emulator.
- LupBook has seen a couple of merge requests. We now have a brand new “matching items” interactive component! Arnav Rastogi and Russell Umboh are working on a “Horizontal Parsons” interactive component. Jillian Lim is polishing her merge request for the inclusion of a navigation sidebar. Finally, Aiman made very good progress on implementing the saving/importing/exporting features.
- Niharika Misal and Saili Karkare have completed most of the data analysis for our current CS study looking at the perception of success and actual success between native students and transfer students. We are now thinking of writing an academic paper.

Students often ask me for suggestions on how to improve their programming skills in their spare time. I typically mention personal projects that they could start (picking a project for which they genuinely have a dire need), open-source projects that they use and to which they could contribute, or any hackathons they could enroll in.

Now, finding projects to work on can still be daunting, so I thought I’d share my own list in case some students are looking for useful project ideas!

Below is a list of projects that I’m personally interested in, but that I unfortunately don’t have much time to work on. So please, feel free to implement any of those and let me know!

I run Arch Linux and the KDE desktop environment on my laptop, so most of these projects are related to this setup.

*Note that these are _not_ academic projects for which I could officially
supervise students and give research credits.*

When I give a lecture, it would be great to have real-time captioning of my voice so that students could have access to subtitles as well. A lot of software has started doing that (PowerPoint, Zoom), but it’s only provided as part of their own products, not as a general service.

Here, the idea would be to turn real-time automatic captioning into an independent piece of software so that it works on any Linux distribution. For example, I was thinking that the rendering could be similar to that of screenkey, a popular screencast tool that displays keystrokes.

Services such as GitHub Gist already allow people to embed pieces of code in their blogs or online articles. However, none of the existing code-display widgets allow visitors to discuss the code itself. The only way for visitors to leave a comment is often via a separate commenting section (e.g., using services like Disqus at the bottom of an article), which is clunky for commenting on code.

The idea here would be to develop the equivalent of Disqus for both showing snippets of code and commenting on them (similar to how SoundCloud allows comments on music at specific timestamps during a song).

I try my best to keep tabs on my work schedule (e.g., I maintain a daily log of what I’ve done) but sometimes it’s hard to fully comprehend how much time I spend on each task.

Since I typically separate my different types of activities into virtual desktops (e.g., one desktop for emails, one for teaching, one for research, etc.), it shouldn’t be too complicated to track the time spent on each virtual desktop.

Unfortunately, I haven’t found a good piece of software that can do that effectively and provide me with daily/weekly/monthly reports.

I recently bought a “Chromecast with Google TV” so that I could stream videos from my laptop to my projector wirelessly, without having to use a bulky HDMI cable. However, I encountered two types of issues:

- Players such as VLC can cast the video and audio just fine, but they don’t support casting subtitles, so that’s a no-go.
- Other players, such as SMPlayer or gnomecast, do support subtitles, but then the audio often doesn’t work.

Now, the audio issue seems to come from audio codecs that my Bluetooth sound bar doesn’t support. For instance, if the audio is encoded in AC3 when it is sent to the Chromecast from my computer, the Chromecast acts as a passthrough and retransmits it directly to my sound bar, which can’t decode it.

For now, I’ve come up with a hacky patch in gnomecast which forces the audio to always be transcoded to mp3 first.

But ideally, if I had more time, I would 1/ turn the transcoding into an option offered in the interface, and 2/ potentially reimplement gnomecast to make it more modular and to use Qt instead of GTK so that it blends better with my desktop environment.

Here is our status at the end of the fall quarter 2023.

- Noah Krim and I completed the first stable version of VRV (Virtual RISC-V), our port of SPIM to RISC-V. VRV was successfully used by 170+ students in ECS 50 (our intro to computer organization course) during the fall quarter.
- The work on LupBook continues making good progress. Arnav Rastogi and Russell Umboh are currently developing a new “matching items” interactive component. Jillian Lim is improving the navigation in the interactive textbook. Finally, Aiman Fatima is implementing some saving/importing/exporting features so that the reader’s work is not lost when the page is refreshed.
- Niharika Misal and Saili Karkare, who work on our current CS study looking at the perception of success and actual success between native students and transfer students, successfully completed the inaugural UR2PhD program.

This quarter (FQ23), I am teaching our lower-div course on computer organization and assembly language (ECS 50). The first couple of weeks are dedicated to how various types of data are represented in the computer. For instance, we study unsigned integers, signed integers, characters, and floating-point numbers.

In one of my slides about floating-point numbers, I briefly mention that the exponent is stored as a “biased value,” but every time in lecture, I struggle to decide how much detail I should provide. If I don’t give much detail, then students can get confused as to why the exponent’s value is not simply stored directly as a signed integer. On the other hand, if I start diving into the reasons for storing the exponent as a biased value, I could probably spend a whole lecture on it!

So I decided to explain the whole reasoning behind the exponent bias here (spoiler: it has to do with comparisons between floating-point numbers). This way I’ll be able to refer students to this article next time I teach this class!

Let’s start from the beginning with the representation of unsigned integers. Unsigned integers are integers that can hold a zero or positive value.

The bits composing an unsigned integer directly represent its magnitude, by adding the corresponding powers of two. So, for example, when interpreting the 8-bit word `01101000` as an unsigned integer, we get $0*2^7 + 1*2^6 + 1*2^5 + 0*2^4 + 1*2^3 + 0*2^2 + 0*2^1 + 0*2^0 = 64 + 32 + 8 = 104$ in decimal.
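As a quick illustration, here is a minimal Python sketch (the function name is mine) that decodes a bit string this way:

```python
# Interpret a bit string as an unsigned integer by adding the
# corresponding powers of two, as in the example above.
def unsigned_value(bits: str) -> int:
    width = len(bits)
    return sum(int(b) * 2 ** (width - 1 - i) for i, b in enumerate(bits))

print(unsigned_value("01101000"))  # 104
```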

Comparing the magnitudes of two unsigned words is conceptually straightforward.
Starting from the most significant bit (MSB), the first integer that has a `1` where the other has a `0` is the greater number. Here is an example between two 8-bit words that contain unsigned integers:

In practice, hardware comparators are able to compare all the bits of two N-bit words at the same time (say, words `A` and `B`), and output whether `A < B`, `A > B`, or `A == B`.

Here is an example of a 4-bit comparator:
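In software, the behavior of such a comparator can be sketched as follows (a minimal Python model with names of my own; note that a real comparator evaluates all bits in parallel, while this loop is only a sequential functional model):

```python
# Model of an N-bit unsigned comparator: find the first position,
# starting from the MSB, where the two words differ.
def compare_unsigned(a: int, b: int, width: int = 4) -> str:
    for i in reversed(range(width)):
        bit_a = (a >> i) & 1
        bit_b = (b >> i) & 1
        if bit_a != bit_b:
            return "A > B" if bit_a else "A < B"
    return "A == B"

print(compare_unsigned(0b1010, 0b0111))  # A > B
```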

Signed integers are integers that can hold a negative, zero, or positive value.

The most common approach to encode signed integers is called *two’s complement*.

In this approach, the word’s MSB acts as a sign bit. If a signed word has its MSB set to `0`, then it contains a positive value, which can be decoded by simply determining the magnitude represented by the remaining bits (just like the unsigned interpretation). In that case, the sign bit itself carries no value (i.e., it is not associated with a power of two), which makes sense since it’s worth `0` anyway.

However, if the sign bit is `1`, then it is meant to represent a base value of $-2^{w-1}$, where $w$ is the word size. The remaining $w-1$ bits are then interpreted as an unsigned integer, and represent a (positive) offset from the negative base value set by the sign bit.
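This base-plus-offset rule can be sketched in a few lines of Python (the function name is mine):

```python
# Decode a two's complement bit string: a sign bit worth -2^(w-1),
# plus the remaining w-1 bits interpreted as an unsigned offset.
def signed_value(bits: str) -> int:
    w = len(bits)
    base = -(2 ** (w - 1)) if bits[0] == "1" else 0
    offset = int(bits[1:], 2)
    return base + offset

print(signed_value("11101000"))  # -128 + 104 = -24
print(signed_value("01101000"))  # 104
```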

Comparing two signed words is slightly more involved than with unsigned integers.

- If the two signed integers have opposite sign bits, then the integer with the positive sign (i.e., sign bit of `0`) is the greater. We don’t even need to compare any other bits.
- If the two signed integers have the same sign bit, then we have to compare their magnitudes (expressed by the $w-1$ remaining bits).
  - For positive signed integers, it’s straightforward since the magnitude directly encodes the value.
  - For negative signed integers, it actually just works too, since the integer with the bigger (positive) magnitude is the one furthest from the negative base value, which means the closest to 0, which means the greater number! See the example below.
  - In either case, we can use the same type of comparator as for unsigned integers, but only on the $w-1$ lower bits.

Here is an example of comparing two negative numbers:
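In code, this check can be sketched as follows (a minimal Python illustration; the two values are my own):

```python
# Two negative 8-bit two's complement words: -24 and -31.
a, b = "11101000", "11100001"
assert a[0] == b[0] == "1"  # same (negative) sign bit

# Compare only the 7 lower bits as unsigned magnitudes: the bigger
# offset is closer to zero, hence the greater number.
greater = a if int(a[1:], 2) > int(b[1:], 2) else b
print(greater)  # 11101000, i.e., -24, which is indeed greater than -31
```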

Floating point numbers are typically represented using the IEEE 754 standard, in their binary normalized scientific notation: $\pm~M*2^E$.

For example, number $+1101.1001$, which represents $+ 2^3 + 2^2 + 2^0 + 2^{-1} + 2^{-4} = +13.5625$ in decimal, can be rewritten as $+1.1011001*2^3$ in its binary normalized scientific notation. The sign is positive ($+$), $M$ is known as the mantissa and only has one digit before the binary point ($1.1011001$), and the exponent $E$ (worth $3$ here) is the power of two multiplying the mantissa to adjust the binary point.

This is akin to representing rational decimal numbers using the normalized scientific notation. For instance, number $-4,321.768$ can be expressed as $-4.321,768 * 10^3$.

In single precision (i.e., `float` in C), a floating-point number is encoded in a 32-bit word. In this format, the sign is expressed by the top bit, the `exp` field on `8` bits is meant to represent the exponent $E$, and the `frac` field on `23` bits represents the fractional part of $M$.
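These three fields can be pulled out of the bit pattern of a `float`, for example with Python’s `struct` module (a small sketch for illustration; the function name is mine):

```python
import struct

# Extract the sign, exp, and frac fields of an IEEE 754
# single-precision float from its 32-bit pattern.
def float_fields(x: float) -> tuple[int, int, int]:
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31          # top bit
    exp = (word >> 23) & 0xFF  # next 8 bits
    frac = word & 0x7FFFFF     # low 23 bits
    return sign, exp, frac

# 13.5625 = +1.1011001 * 2^3, from the earlier example
print(float_fields(13.5625))  # (0, 130, 5832704)
```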

If $E$ is negative, we can represent very tiny numbers, such as $1.0*2^{-42}=0.000,000,000,000,227,373,675,443,232,1$. If $E$ is positive, we can represent very large numbers, such as $1.0*2^{42}=4,398,046,511,104$. Since `exp` is on 8 bits, it gives 256 possible combinations that we can use to represent a contiguous range of negative and positive numbers centered around $0$. For example, if we assumed some standard two’s complement method, it could represent the range $[-128,127]$ (it’s just an example though, because it is not what `exp` represents).

Given a certain value of $E$, all the combinations of `frac` give $2^{23}$ equispaced values in the range $[2^E, 2^{E+1})$. If $E$ is incremented by one, it opens a new interval of $2^{23}$ equispaced values, which has no overlap with the previous interval.
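A quick numeric check of this spacing (using Python floats in place of single precision, which doesn’t change the reasoning, and a value of $E$ that I picked arbitrarily):

```python
# Within [2^E, 2^(E+1)), consecutive frac combinations are spaced
# 2^(E - 23) apart (illustrated here for E = 3).
E = 3
spacing = 2.0 ** (E - 23)
first = 2.0 ** E                     # frac = 0 -> 1.0 * 2^E
second = (1 + 2 ** -23) * 2.0 ** E   # frac = 1 -> 1.000...1 * 2^E
print(second - first == spacing)  # True
```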

In terms of comparison, we can then already observe a few things. First, if two numbers have opposite signs, then the positive number is automatically the greater. Otherwise, we have to consider the signed value of $E$: between two floating-point numbers, the one with the greater $E$ value is the greater number. Finally, if both numbers have the same sign and the same $E$, then the number with the greater $M$, that is, the greater `frac` field, is the greater number.

$E$ is actually not directly stored in `exp` as a signed integer, as hypothesised above. Instead, it is stored as a “biased value”. This means that $E$, which is meant to belong to a contiguous range of 256 negative and positive numbers centered around 0, is offset by a bias in order to be stored in `exp` as a non-negative value (i.e., an unsigned integer).

The bias for single-precision floating-point numbers (`float`) is set to $127$.

So, for instance, if the `exp` field of a given float contains the combination `00101010` (decoded as an unsigned integer, this is $42$), it would in fact represent an exponent $E = 42 - 127 = -85$, which is negative. The float would be a rather small number.

In the other direction, if we tried to represent a large floating-point number such as $1.frac * 2^{42}$ (we don’t care about the mantissa here), then $E$ would be equal to $42$, but it would be stored in the `exp` field as $42 + 127 = 169$ after being offset by the bias.
Why are we using this biased representation for $E$? Well, it has to do with being able to compare floating-point numbers very quickly and efficiently!

Let’s start by considering the case where $E$ was stored in `exp` directly as a signed integer, using the two’s complement method. In that case, the range of $E$ would be $[-128, 127]$. If we had to compare two floats, it would be fairly complex, because we’d have to look at the sign bit, and then we’d have to apply the comparison method for signed integers on the `exp` field:

- If the two floats have opposite sign bits, then the float with the positive sign (i.e., sign bit of `0`) is the greater. We don’t even need to compare any other bits.
- If the two floats have the same sign bit, then we have to compare their `exp` fields as signed integers (same technique as explained above).
  - If the two signed exponents have opposite sign bits, then the exponent with the positive sign (i.e., sign bit of `0`) is the greater.
  - If the two signed exponents have the same sign bit, then we have to compare their magnitudes (expressed by the $7$ remaining bits).
    - For positive signed integers, it’s straightforward since the magnitude directly encodes the value.
    - For negative signed integers, it actually just works too, since the integer with the bigger (positive) magnitude is the one furthest from the negative base value, which means the closest to 0, which means the greater number!
    - In either case, we can use the same type of comparator as for unsigned integers, but only on the $7$ lower bits.
- If the two floats have the same exponent, then we have to compare their `frac` fields as unsigned integers. Whichever float has the largest magnitude is the greater number.

As you can see, we have to nest the signed comparison of the exponent, after having considered the sign of the floats themselves. Doing this type of comparison would require a specific, and complicated type of comparator.

Now, if we store the exponent of a floating-point number as a biased value, it becomes an unsigned integer, which considerably simplifies the comparison of two floats. In that case, the comparison becomes:

- If the two floats have opposite sign bits, then the float with the positive sign is the greatest. We don’t even need to compare any other bits.
- If the two floats have the same sign bit, then we can compare all the
following bits as a magnitude.
- The number with the bigger exponent will appear to have a bigger magnitude.
- If the two numbers have the same exponent, the fractional part of the mantissa will become relevant and whichever has the bigger fractional part will also appear to have a bigger magnitude.
- In either case, we can use the same type of comparator as for unsigned integers, on all the bits after the sign bit.

Hopefully, you’ve noticed that we can use the same type of comparator as for signed integers to compare floats, which makes things so much easier since we already have them in any standard processor!

By storing the exponent of a floating-point number as a biased value, we enable comparing floating-point numbers as if they were simple signed integers, that is using the same type of comparator.
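This can be observed directly in software: for two positive floats, comparing their raw 32-bit patterns as integers yields the same ordering as comparing the floats themselves. A minimal Python check (the values and function name are my own; negative floats would additionally need the sign-magnitude handling described above, since their bit patterns are not two’s complement):

```python
import struct

def float_bits(x: float) -> int:
    """Return the 32-bit pattern of a single-precision float."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return word

# For two positive floats, the ordering of the bit patterns matches
# the ordering of the values, thanks to the biased exponent.
a, b = 1.5, 2.25
print(float_bits(a) < float_bits(b), a < b)  # True True
```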
