The Syscall Showdown: CRuby writes files with 40% fewer syscalls than CPython?
We’ve released a new version of Cirron that can now trace syscalls and record performance counters for individual lines of Ruby code, just like it could already do for Python (more here and here). It makes it very easy to quickly inspect what’s happening in any section of your code and even assert what should be happening in tests, for example.
To put it through its paces I’ve compared what syscalls each language uses for several common patterns: File IO, generating random numbers, telling time and even just printing a string.
File IO
Let’s start with something surprising right away. Here are the snippets under investigation, simply writing a string to a file (I’ll be omitting the Cirron setup from the snippets later on):
# python
with Cirron.Tracer() as t:
with open("test.txt", "w") as f:
f.write("Hello, File I/O!")
# ruby
t = Cirron::tracer do
File.open("test.txt", "w") do |f|
f.write("Hello, File I/O!")
end
end
And here then, are the system calls each language makes. I’ve highlighted calls that aren’t made by the other language and added approximate time spent in each call on my system:
Python | Ruby |
---|---|
openat(AT_FDCWD, "test.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 3 ~80µs | openat(AT_FDCWD, "test.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 5 ~80µs |
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=0, ...}, AT_EMPTY_PATH) = 0 ~20µs | |
ioctl(3, TCGETS, 0x7ffe559ef8b0) = -1 ENOTTY ~20µs | ioctl(5, TCGETS, 0x7ffe2a8d9e90) = -1 ENOTTY ~20µs |
lseek(3, 0, SEEK_CUR) = 0 ~20µs | |
lseek(3, 0, SEEK_CUR) = 0 ~20µs | |
write(3, "Hello, File I/O!", 16) = 16 ~30µs | write(5, "Hello, File I/O!", 16) = 16 ~30µs |
close(3) = 0 ~80µs | close(5) = 0 ~80µs |
Both start off with openat to open test.txt
in the current directory.
Python then, unlike Ruby, uses newfstat to figure out what buffer size to use by looking at st_blksize
1.
Both languages then use ioctl to figure out whether the file descriptor is a TTY.
So far as expected, but then… Python decides to lseek to the start of the file twice. This doesn’t seem strictly necessary and a quick look through the cPython codebase didn’t reveal why this is happening. If you know, let me know!
Anyway: finally both languages use write to write our string and close to close the file descriptor.
How about reading a file?
# python
with open("test.txt", "r") as f:
content = f.read()
# ruby
content = File.read("test.txt")
Python | Ruby |
---|---|
openat(AT_FDCWD, "test.txt", O_RDONLY|O_CLOEXEC) = 3 ~30µs | openat(AT_FDCWD, "test.txt", O_RDONLY|O_CLOEXEC) = 5 ~30µs |
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=16, ...}, AT_EMPTY_PATH) = 0 ~20µs | |
ioctl(3, TCGETS, 0x7ffe559ef8b0) = -1 ENOTTY ~20µs | ioctl(5, TCGETS, 0x7ffe2a8d9fd0) = -1 ENOTTY ~20µs |
lseek(3, 0, SEEK_CUR) = 0 ~20µs | |
lseek(3, 0, SEEK_CUR) = 0 ~20µs | |
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=16, ...}, AT_EMPTY_PATH) = 0 ~20µs | newfstatat(5, "", {st_mode=S_IFREG|0644, st_size=16, ...}, AT_EMPTY_PATH) = 0 ~20µs |
lseek(5, 0, SEEK_CUR) = 0 ~20µs | |
read(3, "Hello, File I/O!", 17) = 16 ~20µs | read(5, "Hello, File I/O!", 16) = 16 ~20µs |
read(3, "", 1) = 0 ~20µs | read(5, "", 8192) = 0 ~20µs |
close(3) = 0 ~20µs | close(5) = 0 ~20µs |
Nothing terribly surprising here. Python still doing its double lseek and Ruby, probably not wanting to be left behind, adds an lseek of its own (also not strictly necessary, as far as I can tell). Note how openat here takes only ~30µs, but it was ~80µs when opening for with O_WRONLY|O_CREAT|O_TRUNC
.
Sneaky syscalls
Let’s look at generating random numbers and telling time:
# python
random_int = random.randint(1, 100)
# ruby
random_int = rand(1..100)
# python
current_time = time.time()
# ruby
current_time = Time.now
You might reasonably assume you’d need a system call here too, but not so. Since these operations are frequently used and not privileged, they use vDSO and turn into normal function calls, avoiding paying the price of a syscall. This makes them invisible to strace
but you’d see them pop up in ltrace
.
Printing
Here’s one I thought would be obvious, but turned out a little surprising too:
# python
print("Hello")
# ruby
puts "Hello"
Python | Ruby |
---|---|
write(1, "Hello\n") = 6 ~50-250µs | writev(1, [{iov_base=\"Hello\", iov_len=5}, {iov_base=\"\\n\", iov_len=1}], 2) = 6 ~50-250µs |
I expected both languages to just use write, but apparently some 6 years ago Ruby switched to using writev instead in Feature #14042: IO#puts: use writev if available. Before this, puts
would use an extra write to output a newline and hence wasn’t atomic.
Fin
I hope you found this interesting! If you’d find it useful if Cirron supported another language too, let me know!