Evolvis – Make it into a Project!

Valid UTF-8 but invalid XML

Another PSA: something surprising about XML.

As you might all know, XML must be valid UTF-8 (or UTF-16 (or another encoding supported by the parser, but one which yields valid Unicode codepoints when read and converted)). Some characters, such as the ampersand ‘&’, must be escaped (“&#38;” or “&#x26;”, although “&amp;” may also work, depending on the domain) or put into a CDATA section (“<![CDATA[&]]>”).

A bit surprisingly, a literal backspace character (ASCII 08h, Unicode U+0008) is not allowed in the text. I filed a bugreport against libxml2, asking it to please encode these characters.

A bit more research followed. Surprisingly, there are characters that are not valid in XML “documents” in any way, not even as entities or in CDATA sections. (xmlstarlet, by the way, errors out somewhat nicely for an unescaped literal or entity-escaped backspace, but behaves absolutely hilarious for a literal backspace in a CDATA section.) Basically, XML contains a whitelist for the following Unicode codepoints:

  • U+0009
  • U+000A
  • U+000D
  • U+0020‥U+D7FF
  • U+E000‥U+FFFD
  • U-00010000‥U-0010FFFF

Additionally, a certain number of codepoints is discouraged: U+007F‥U+0084 (IMHO wise), U+0086‥U+009F (also wise, but why allow U+0085?), U+FDD0‥U+FDEF (a bit surprisingly, but consistent with disallowing the backspace character), and the last two codepoints of every plane (U+FFFE and U+FFFF were already disallowed, but U-0001FFFE, U-0001FFFF, …, U-0010FFFF weren’t; this is extremely wise).

The suggestion seems to be to just strip these characters silently from the XML “document”.

I’m a bit miffed about this, as I don’t even use XML directly (I’m extending a PHP “webapplication” that is a SOAP client and talks to a Java™ SOAP-WS) and would expect this to preserve my strings, but, oh my. I’ve forwarded the suggestion to just strip them silently to the libxml2 maintainers in the aforementioned bug report, for now, and may even hack that myself (on customer-paid time). More robust than hacking the PHP thingy to strip them first, anyway – I’ve got no control over the XML after all.

Sharing this so that more people know that not all UTF-8 is valid in XML. Maybe it saves someone else some time. (Now wondering whether to address this in my xhtml_escape shell function. Probably should. Meh.)

Tip of the day: bind tomcat7 to loopback i/f only

We already edit /etc/tomcat7/server.xml after installing the tomcat7 Debian package, to get it to talk AJP instead of HTTP (so we can use libapache2-mod-jk to put it behind an Apache 2 httpd, which also terminates SSL):

We already comment out the block…

    <Connector port="8080" protocol="HTTP/1.1"  
               connectionTimeout="20000"
               URIEncoding="UTF-8"
               redirectPort="8443" />

… and remove the comment chars around the line…

    <Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />

… so all we need to do is edit that line to make it look like…

    <Connector address="127.0.0.1" port="8009" protocol="AJP/1.3" redirectPort="8443" />

… and we’re all set.

(Your apache2 vhost needs a line

JkMount /?* ajp13_worker

and everything Just Works™ with the default configuration.)

Now, tomcat7 is only accessible from localhost (Legacy IP), and we don’t need to firewall the AJP (or HTTP/8080) port. Do make sure your Apache 2 access configuration works, though ☺

PSA: #shellshock still unfixed except in Debian unstable, testing, *buntu LTS

I just installed, for work, Hanno Böck’s bashcheck utility on our monitoring system, and watched all¹ systems go blue.

① All but two. One is not executing remote scripts from the monitoring for security reasons, the other is my desktop which runs Debian “sid” (unstable).

(Update, 2014-10-20: jessie, precise, trusty are also green now.)

This means that all those distributions still have unfixed #shellshock bugs.

  • lenny (with Md’s packages): bash (3.2-4.2) = 3.2.53(1)-release
    • CVE-2014-6277 (lcamtuf bug #1)
  • squeeze (LTS): bash (4.1-3+deb6u2) = 4.1.5(1)-release
    • CVE-2014-6277 (lcamtuf bug #1)
  • wheezy (stable-security): bash (4.2+dfsg-0.1+deb7u3) = 4.2.37(1)-release
    • CVE-2014-6277 (lcamtuf bug #1)
    • CVE-2014-6278 (lcamtuf bug #2)
  • jessie (testing): bash (4.3-10) = 4.3.27(1)-release
    • CVE-2014-6277 (lcamtuf bug #1)
    • CVE-2014-6278 (lcamtuf bug #2)
  • sid (unstable): bash (4.3-11) = 4.3.30(1)-release
    • none
  • CentOS 5.5: bash-3.2-24.el5 = 3.2.25(1)-release
    • extra-vulnerable (function import active)
    • CVE-2014-6271 (original shellshock)
    • CVE-2014-7169 (taviso bug)
    • CVE-2014-7186 (redir_stack bug)
    • CVE-2014-6277 (lcamtuf bug #1)
  • CentOS 5.6: bash-3.2-24.el5 = 3.2.25(1)-release
    • extra-vulnerable (function import active)
    • CVE-2014-6271 (original shellshock)
    • CVE-2014-7169 (taviso bug)
    • CVE-2014-7186 (redir_stack bug)
    • CVE-2014-6277 (lcamtuf bug #1)
  • CentOS 5.8: bash-3.2-33.el5_10.4 = 3.2.25(1)-release
    • CVE-2014-6277 (lcamtuf bug #1)
  • CentOS 5.9: bash-3.2-33.el5_10.4 = 3.2.25(1)-release
    • CVE-2014-6277 (lcamtuf bug #1)
  • CentOS 5.10: bash-3.2-33.el5_10.4 = 3.2.25(1)-release
    • CVE-2014-6277 (lcamtuf bug #1)
  • CentOS 6.4: bash-4.1.2-15.el6_5.2.x86_64 = 4.1.2(1)-release
    • CVE-2014-6277 (lcamtuf bug #1)
  • CentOS 6.5: bash-4.1.2-15.el6_5.2.x86_64 = 4.1.2(1)-release
    • CVE-2014-6277 (lcamtuf bug #1)
  • lucid (10.04): bash (4.1-2ubuntu3.4) = 4.1.5(1)-release
    • CVE-2014-6277 (lcamtuf bug #1)
  • precise (12.04): bash (4.2-2ubuntu2.5) = 4.2.25(1)-release
    • CVE-2014-6277 (lcamtuf bug #1)
    • CVE-2014-6278 (lcamtuf bug #2)
  • quantal (12.10): bash (4.2-5ubuntu1) = 4.2.37(1)-release
    • extra-vulnerable (function import active)
    • CVE-2014-6271 (original shellshock)
    • CVE-2014-7169 (taviso bug)
    • CVE-2014-7186 (redir_stack bug)
    • CVE-2014-6277 (lcamtuf bug #1)
    • CVE-2014-6278 (lcamtuf bug #2)
  • trusty (14.04): bash (4.3-7ubuntu1.4) = 4.3.11(1)-release
    • CVE-2014-6277 (lcamtuf bug #1)
    • CVE-2014-6278 (lcamtuf bug #2)

I don’t know if/when all distributions will have patched their packages ☹ but thought you’d want to know the hysteria isn’t over yet…

… however, I hope you were not stupid enough to follow the advice of this site which suggests you to download some random file over the ’net and execute it with superuser permissions, unchecked. (I think the Ruby people were the first to spread this extremely insecure, stupid and reprehensible technique.)

Updates:

  • rsc points out that CentOS only supports 5.«latest» and 6.«latest», and paying RHEL get 5.«x».«y» but only occasionally. We updated one of the two systems in question and shut the other down due to lack of use.
  • trusty (14.04): bash (4.3-7ubuntu1.5) = 4.3.11(1)-release
    • none
  • Yes, comments on this blog are disabled; mail t.glaser@tarent.de for feedback.
  • Since I was asked (twice): the namespace patches by Florian Weimer protect from most exploits. The bugs are, nevertheless, present:
    root@debian-wheezy:~ # env 'BASH_FUNC_foo()=() { x() { _; }; x() { _; } <<'"$(perl -e '{print "A"x1000}'); }" bash -c :
    Segmentation fault
    139|root@debian-wheezy:~ # dmesg | tail -1
    [3121102.362274] bash[1699]: segfault at dfdfdfdf ip 00000000f766df36 sp 00000000ffe90b34 error 4 in libc-2.13.so[f75ee000+15d000]
     
  • To one eMail sender: If you do not understand why a CGI or something else could invoke a shell, or what a segmentation fault trap is, do not bother me. Especially not in that tone.
  • To another eMail sender: Yes, quantal is end of life. It’s also upgraded. First and last time I ever used *buntu’s “do-release-upgrade”. Broke kernel and GRUB, and upgraded to saucy. I manually “apt-get –purge dist-upgrade”d to trusty (went surprisingly well).
  • precise (12.04): bash (4.2-2ubuntu2.6) = 4.2.25(1)-release
    • none
  • jessie (testing): bash (4.3-11) = 4.3.30(1)-release
    • none
  • Update 27.10.2014: Clearly, distributions are not fixing the lcamtuf bug series (Debian stable, CentOS), believing that the affix patch makes them invulnerable, while it just removes the most common/popular/exposed attack vector. Sad story.

Thanks to ↳ tarent for letting me do this work during $dayjob time!

git rebasing considered harmful

git rebase is problematic (from a version control system user point of view) because it rewrites history. We all knew that.

But did you know that git pull --rebase, commonly used before a git push, can also be harmful, destroy history, and surprise users in a negative way?

10:08⎜<Beuc:#fusionforge> Lo-lan-do, we must have committed at the same time :) First time I rebase and my commit disappear ;)
10:11⎜«Lo-lan-do:#fusionforge» So who won?
10:11⎜«Lo-lan-do:#fusionforge» Aha, I did :-)

This does fit git’s model of managing patches and tracking content, but is just irresponsible for a version control system. (Also, imagine incensed contributors whose commits just vanish.) So, danger, beware of using git rebasing when you use git as distributed version control system!

In a related way: merge commits are good. Especially when merging between, into or from feature branches. (A friend had his .gitconfig set up to default to rebasing… ugh.) So there should have been one place where you used rebase: to avoid merge commits when people work on the same repository at the same time (but, ideally, on different files). Those were mostly annoying. But, as you can see above, the alternatives are even worse…

iCalender and timezones

Okay. I just created three events at tomorrow 10:00 CEST (08:00 UTC), on three different accounts on the very same Debian Lenny machine. All three use nxclient to log into KDE 3, with Kontact/KDEPIM Version 1.2.9 (enterprise35 20131030.a834355).

Then, I looked at the event invitations (METHOD:REQUEST BEGIN:VEVENT) in a simple eMail program (alpine). What I got made me beg to differ:

$ fgrep -e DTSTAMP -e CREATED -e UID -e LAST-MODIFIED -e SUMMARY -e DTSTART -e DTEND s?

s2:DTSTAMP:20140812T114542Z
s2:CREATED:20140812T114541Z
s2:UID:libkcal-1345193151.365
s2:LAST-MODIFIED:20140812T114541Z
s2:SUMMARY:sk2
s2:DTSTART:20140813T080000Z
s2:DTEND:20140813T100000Z

s3:DTSTAMP:20140812T134551Z
s3:CREATED:20140812T134550Z
s3:UID:libkcal-579827798.725
s3:LAST-MODIFIED:20140812T134550Z
s3:SUMMARY:sk3
s3:DTSTART:20140813T100000Z
s3:DTEND:20140813T120000Z

s4:DTSTAMP:20140812T134559Z
s4:CREATED:20140812T134558Z
s4:UID:libkcal-743876151.470
s4:LAST-MODIFIED:20140812T134558Z
s4:SUMMARY:sk4
s4:DTSTART:20140813T100000Z
s4:DTEND:20140813T120000Z

I should mention that I created the events at roughly 13:45 CEST today (11:45 UTC), and the system timezone on the box as well as the “Time & Date” zone in the Kontact settings are all “Europe/Berlin”.

To add insult to injury: the calendar view on the accounts that created the event all do show it for tomorrow, 10:00 local time.

I cannot possibly imagine how this could go wrong, seeing as those are all on the same machine…

WTF is going on here?


Update: .kde/share/config/korganizerrc had TimeZoneId wrong. I had to change a different timezone (such as Europe/Bratislava), hit the OK button, confirm to “Keep Times”, then re-open the settings dialogue and change back to Europe/Berlin, for it to work.

Nicht Hardlink, sondern…?

TANSTA “softlink”. ITYM: symlink(7)

Actually, there’s “link” (also “hardlink”) and “symbolic link” (short “symlink”). (Oh, and reparse points, but let’s not get there, lest we mention *.lnk files…)

Inspired by a posting on the klibc mailing list.

SSD

Following the Wiki I put “discard” entries into my fstab(5) for swap (but not / as it’s suggested to use fstrim instead, and I had noatime for /boot and relatime for / already) and changed the scheduler. What wasn’t written there was to set vm.swappiness=0 in sysctl.d/local.conf *shrug* but it helps.

What they also didn’t mention:

mount -t tmpfs swap /var/cache/apt/archives

And in sid, which my dayjob-laptop is running, APT _finally_ creates the missing “partial/” subdirectory itself. Thanks!

I had the filesystems already created, so I changed the ext options with tune2fs:

tune2fs -E stride=1024,stripe_width=1024 /dev/sda2 # 1 KiB blocks: /boot
tune2fs -E stride=256,stripe_width=256 /dev/sda4 # 4 KiB blocks: /

I wonder how well that works. I also did _not_ manage to find out my device’s flash block size (search engine fodder: erase block size), so I assumed 1 MiB (also used for partition alignment already) as worst-case scenario. Input welcome on how to find *that* out.

Send Alt-SysRq-* to virt-manager guest using virsh

virsh send-key guestname KEY_LEFTALT KEY_SYSRQ KEY_H

This doesn’t work in virt-manager, but the virsh CLI tool is just fine.

PS: zerofree rocks! And can be installed in Grml 2011.05 just finely.

Fun with ssh pubkey auth

Okay, so imagine this: you just generated an SSH RSA key and threw its public part on system B into ~foo/.ssh/authorized_keys and its private part on system A into ~bar/.ssh/id_rsa but can’t login. Why?

Automated processes (Jenkins *cough*) often need you to ssh(1) manually once, to accept the remote host’s server key. Do that.

The id_rsa file on system A must be owned by the user bar and chmod 0600 or 0400 (similarily, the .ssh directory has strict permission checks, and everything in the path until there). Check those.

And, the most surprising one of the day: if there’s an id_rsa.pub it will be used for offering a key to the remote host (B) even if it does not match the secret key. Deleting A:~bar/.ssh/id_rsa.pub apparently makes OpenSSH generate the public part from the secret key each time (or just put the correct pubkey there), but if one’s there, it seems to like to use them. (That was the only part of this post that was news to even me, of course ☺)

And, as bottom line: hello to Planet Debian from “mirabilos at work”, too. I’ll occasionally tag posts so they show up here, if I think they’re of interest, since I’m doing Debian work at the dayjob, too.

How to find out when to a git repository was last committed?

git log -n 1 --all --full-history --pretty=format:'%cD'

This should™ scan all branches, take the chronologically last commit and output its committer date. Still doesn’t take into account git-receive-pack times, but we can just look at the mtime of the projectname-commits@lists.forgename mailing list for that.