<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[All things sysadmin]]></title>
  <link href="http://northernmost.org/blog//atom.xml" rel="self"/>
  <link href="http://northernmost.org/blog//"/>
  <updated>2017-02-16T21:34:08+00:00</updated>
  <id>http://northernmost.org/blog//</id>
  <author>
    <name><![CDATA[Erik Ljungstrom]]></name>
    
  </author>
  <generator uri="http://octopress.org/">Octopress</generator>

  
  <entry>
    <title type="html"><![CDATA[Sysadmin 101: Troubleshooting]]></title>
    <link href="http://northernmost.org/blog//troubleshooting-101/index.html"/>
    <updated>2017-02-16T22:27:31+00:00</updated>
    <id>http://northernmost.org/blog//troubleshooting-101/troubleshooting-101</id>
    <content type="html"><![CDATA[<p>I typically keep this blog strictly technical, keeping observations, opinions
and the like to a minimum. But this, and the next few posts, will be about
basics and fundamentals for starting out in system administration/SRE/system engineer/sysops/devops-ops
(whatever you want to call yourself) roles more generally.<br/>
Bear with me!</p>

<p><em>“My web site is slow”</em></p>

<p>I just picked the type of issue for this article at random; this can be
applied to pretty much any sysadmin-related troubleshooting.
It’s not about showing off the cleverest oneliners to find the most
information. It’s also not an exhaustive, step-by-step “flowchart” with the
word &ldquo;profit&rdquo; in the last box.
It’s about general approach, by means of a few examples.<br/>
The example scenarios are solely for illustrative purposes. They sometimes
have a basis in assumptions that don&rsquo;t apply to all cases all of the time, and I&rsquo;m
positive many readers will go <em>&ldquo;oh, but I think you will find&hellip;&rdquo;</em> at some point.<br/>
But that would be missing the point.</p>

<p>Having worked in support, or within a support organization, for over a decade,
one thing strikes me time and time again, and it is what made me write
this:<br/>
<strong>The instinctive reaction many techs have when facing a problem, is
to start throwing potential solutions at it.</strong></p>

<!--more-->


<p><em>“My website is slow”</em></p>

<ul>
<li>I’m going to try upping <code>MaxClients/MaxRequestWorkers/worker_connections</code></li>
<li>I’m going to try to increase <code>innodb_buffer_pool_size/effective_cache_size</code></li>
<li>I’m going to try to enable <code>mod_gzip</code> (true story, sadly)</li>
</ul>


<p><em>“I saw this issue once, and then it was because X. So I’m going to try to fix X
again, it might work”</em>.</p>

<p>This wastes a lot of time, and leads you down a wild goose chase. In the dark. Wearing greased mittens.<br/>
InnoDB’s buffer pool may well be at 100% utilization, but that’s just because
there are remnants of a large one-off report someone ran a while back in there.
If there are no evictions, you’ve just wasted time.</p>

<h4>Quick side-bar before we start</h4>

<p>At this point, I should mention that while it’s equally applicable to many
roles, I’m writing this from a general support system administrator’s point of
view. In a mature, in-house organization or when working with larger, fully managed or
&ldquo;enterprise&rdquo; customers, you’ll typically have everything instrumented,
measured, graphed, thresheld (not even a word) and alerted on. Then your approach
will often be rather different. We’re going in blind here.</p>

<p>If you don’t have that sort of thing at your disposal;</p>

<h3>Clarify and First look</h3>

<p>Establish what the issue actually is. “Slow” can take many forms. Is it time to
first byte? That’s a whole different class of problem from poor JavaScript
loading and pulling down 15 MB of static assets on each page load.
Is it slow, or just slower than it usually is? Two very different plans of
attack!</p>
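<p>As a rough way of telling time to first byte apart from total transfer time, here is a minimal Python sketch using only the standard library. The function name <code>time_request</code> is made up for illustration, it only speaks plain HTTP, and it treats the moment the status line and headers have been read as the first byte, which is an approximation.</p>

```python
import http.client
import time


def time_request(host, port=80, path="/"):
    """Return (seconds_to_first_byte, seconds_total) for one plain HTTP GET."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        start = time.monotonic()
        conn.request("GET", path)
        # getresponse() returns once the status line and headers have arrived.
        response = conn.getresponse()
        first_byte = time.monotonic() - start
        # Draining the body gives the total transfer time.
        response.read()
        total = time.monotonic() - start
    finally:
        conn.close()
    return first_byte, total
```

<p>Something like <code>time_request("www.example.com")</code> returns both numbers. A large first number points at whatever happens before the response starts; a large gap between the two points at the payload.</p>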

<p>Make sure you know what the issue reported/experienced actually is before you
go off and do something. Finding the source of the problem is often difficult
enough, without also having to find the problem itself.<br/>
That is the sysadmin equivalent of bringing a knife to a gunfight.</p>

<h3>Low hanging fruit / gimmies</h3>

<p>You are allowed to look for a few usual suspects when you first log in to a
suspect server. In fact, you should! I tend to fire off a smattering of commands
whenever I log in to a server to just very quickly check a few things; Are we
swapping (<code>free/vmstat</code>), are the disks busy (<code>top/iostat/iotop</code>), are we dropping
packets (<code>netstat</code>, <code>/proc/net/dev</code>), is there an undue number of connections in an
undue state (<code>netstat</code>), is something hogging the CPUs (<code>top</code>), is someone else on
this server (<code>w/who</code>), any eye-catching messages in syslog and <code>dmesg</code>?</p>

<p>There’s little point to carrying on if you have 2000 messages from your RAID
controller about how unhappy it is with its write-through cache.</p>

<p>This doesn’t have to take more than half a minute.
If nothing catches your eye – continue.</p>
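<p>One of those quick checks, counting connections per TCP state, could be sketched in Python like this. It assumes output shaped like <code>netstat -tan</code> on Linux, and both the function name and the sample data are made up for illustration.</p>

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state from `netstat -tan` style output."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Made-up sample output, for illustration only.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:80             10.0.0.9:51234          TIME_WAIT
tcp        0      0 10.0.0.5:80             10.0.0.9:51240          TIME_WAIT
tcp        0      0 10.0.0.5:80             10.0.0.7:40022          ESTABLISHED
"""

for state, count in count_tcp_states(SAMPLE).most_common():
    print(state, count)
```

<p>A few thousand connections piled up in one state is the kind of thing that should catch your eye here.</p>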

<h3>Reproduce</h3>

<p>If there indeed is a problem somewhere, and there’s no low hanging fruit to be
found;</p>

<p>Take all steps you can to try and reproduce the problem. When you can
reproduce, you can observe. <strong>When you can observe, you can solve.</strong>
Ask the person reporting the issue what exact steps to take to reproduce the
issue if it isn’t already obvious or covered by the first section.</p>

<p>Now, for issues caused by solar flares and clients running exclusively on
OS/2, it’s not always feasible to reproduce. But your first port of call
should be to at least try!
In the very beginning, all you know is “X thinks their website is slow”. For
all you know at that point, they could be tethered to their GPRS mobile phone and
applying Windows updates. Delving any deeper than we already have at that
point is, again, a waste of time.</p>

<p>Attempt to reproduce!</p>

<h3>Check the log!</h3>

<p>It saddens me that I felt the need to include this. But I&rsquo;ve seen escalations
that ended mere minutes after someone ran <code>tail /var/log/..</code>.
Most *NIX tools these days
are pretty good at logging. Anything blatantly wrong will manifest itself quite
prominently in most application logs. Check it.</p>

<h3>Narrow down</h3>

<p>If there are no obvious issues, but you can reproduce the reported problem,
great.
So, you know the website is slow.
Now you’ve narrowed things down to: Browser rendering/bug, application
code, DNS infrastructure, router, firewall, NICs (all eight+ involved),
ethernet cables, load balancer, database, caching layer, session storage, web
server software, application server, RAM, CPU, RAID card, disks.<br/>
Add a smattering of other potential culprits depending on the set-up. It could
be the SAN, too. And don’t forget about the hardware WAF! And.. you get my
point.</p>

<p>If the issue is time-to-first-byte you’ll of course start applying known fixes
to the webserver, that’s the one responding slowly and what you know the most
about, right? Wrong!<br/>
You go back to trying to reproduce the issue. Only this time, you try to
eliminate as many potential sources of issues as possible.</p>

<p>You can eliminate the vast majority of potential culprits very
easily:
Can you reproduce the issue locally from the server(s)?
Congratulations, you’ve
just saved yourself having to try your fixes for BGP routing.<br/>
If you can’t, try from another machine on the same network.
If you can - at least you can move the firewall down your list of suspects, (but do keep
a suspicious eye on that switch!)</p>

<p>Are all connections slow? Just because the
server is a web server, doesn’t mean you shouldn’t try to reproduce with another
type of service. <a href="http://nc110.sourceforge.net/">netcat</a> is very useful in these scenarios
(but chances are your SSH connection would have been lagging
this whole time, as a clue)! If that’s also slow, you at least know you’ve
most likely got a networking problem and can disregard the entire web
stack and all its components. Start from the top again with this knowledge
(do not collect $200).
Work your way from the inside-out!</p>

<p>Even if you can reproduce locally - there’s still a whole lot of “stuff”
left. Let’s remove a few more variables.
Can you reproduce it with a flat-file? If <code>i_am_a_1kb_file.html</code> is slow,
you know it’s not your DB, caching layer or anything beyond the OS and the webserver
itself.<br/>
Can you reproduce with an interpreted/executed
<code>hello_world.(py|php|js|rb..)</code> file?
If you can, you’ve narrowed things down considerably, and you can focus on
just a handful of things.
If <code>hello_world</code> is served instantly, you’ve still learned a lot! You know
there aren’t any blatant resource constraints, any full queues or stuck
IPC calls anywhere. So it’s something the application is doing or
something it’s communicating with.</p>

<p>Are all pages slow? Or just the ones loading the &ldquo;Live scores feed&rdquo; from a
third party?</p>

<p><strong>What this boils down to is; What’s the smallest amount of “stuff” that you
can involve, and still reproduce the issue?</strong></p>
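<p>The inside-out approach can be sketched as a list of probes, ordered from least to most &ldquo;stuff&rdquo; involved, where you stop at the first one that is slow. This is only an illustration: the function name, the probe names and the threshold are all hypothetical, and the probes themselves are whatever reproduces the issue in your set-up.</p>

```python
import time


def first_slow_probe(probes, threshold_seconds):
    """Run probes in order; return the name of the first slow one, else None.

    `probes` is a list of (name, callable) pairs, ordered from the least
    amount of "stuff" involved to the most.
    """
    for name, probe in probes:
        start = time.monotonic()
        probe()
        elapsed = time.monotonic() - start
        if elapsed > threshold_seconds:
            return name
    return None
```

<p>The probes could be as simple as fetching <code>i_am_a_1kb_file.html</code>, then a <code>hello_world</code> script, then a real page. The first name that comes back tells you where to look next.</p>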

<p>Our example is a slow web site, but this is equally applicable to almost
any issue. Mail delivery?
Can you deliver locally? To yourself? To &lt;common provider here&gt;?  Test
with small, plaintext messages. Work your way up to the 2MB campaign
blast. STARTTLS and no STARTTLS.
Work your way from the inside-out.</p>

<p>Each of these steps takes mere seconds, far quicker than
implementing most “potential” fixes.</p>

<h3>Observe / isolate</h3>

<p>By now, you may already have stumbled across the problem by virtue of being unable to
reproduce when you removed a particular component.</p>

<p>But if you haven&rsquo;t, or you still don&rsquo;t know <strong>why</strong>;
Once you’ve found a way to reproduce the issue with the smallest amount of
“stuff” (technical term) between you and the issue, it’s time to start
isolating and observing.</p>

<p>Bear in mind that many services can be run in the foreground, and/or have
debugging enabled. For certain classes of issues, it is often hugely helpful to do this.</p>

<p>Here’s also where your traditional armory comes into play. <code>strace</code>, <code>lsof</code>, <code>netstat</code>,
<code>GDB</code>, <code>iotop</code>, <code>valgrind</code>, language profilers (cProfile, xdebug, ruby-prof…).
Those types of tools.</p>

<p>Once you’ve come this far, you rarely end up having to break out profilers or
debuggers though.</p>

<p><a href="https://linux.die.net/man/1/strace"><code>strace</code></a> is often a very good place to start.<br/>
You might notice that the application is stuck on a particular <code>read()</code> call
on a socket file descriptor connected to port 3306 somewhere. You’ll know
what to do.<br/>
Move on to MySQL and start from the top again. Low hanging
fruit: “Waiting_for * lock”, deadlocks, max_connections.. Move on to: All
queries? Only writes? Only certain tables? Only certain storage
engines?&hellip;</p>

<p>You might notice that there’s a <code>connect()</code> to an external API resource that
takes five seconds to complete, or even times out. You’ll know what to do.</p>

<p>You might notice that there are 1000 calls to <code>fstat()</code> and <code>open()</code> on the
same couple of files as part of a circular dependency somewhere. You’ll
know what to do.</p>

<p>It might not be any of those particular things, but I promise you, you’ll
notice something.</p>
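<p>When the trace is long, it can help to let a script point out the slow calls. <code>strace -T</code> appends the time spent in each syscall in angle brackets at the end of the line; the minimal Python sketch below picks out calls above a threshold. The function name and the sample trace are made up for illustration, and the parsing is deliberately naive (it ignores unfinished and resumed calls).</p>

```python
import re

# With `strace -T`, each completed call ends in "<seconds>".
_TIMED_CALL = re.compile(r"(\w+)\(.*<(\d+\.\d+)>\s*$")


def slow_syscalls(strace_output, min_seconds=1.0):
    """Return (syscall, seconds) pairs for calls taking at least min_seconds."""
    slow = []
    for line in strace_output.splitlines():
        match = _TIMED_CALL.search(line)
        if match and float(match.group(2)) >= min_seconds:
            slow.append((match.group(1), float(match.group(2))))
    return slow


# Made-up sample trace, for illustration only.
SAMPLE = r'''open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 3 <0.000021>
connect(4, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("203.0.113.10")}, 16) = 0 <5.004312>
read(5, "\1\0\0\1", 4) = 4 <0.000040>
'''

print(slow_syscalls(SAMPLE))
```

<p>In the sample, only the five second <code>connect()</code> is reported, which is exactly the kind of line you would want to notice.</p>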

<p>If you’re only going to take one thing from this section, let it be; learn
to use <code>strace</code>! <strong>Really</strong> learn it, read the <em>whole</em> man page. Don’t even skip
the HISTORY section. <code>man</code> each syscall whose purpose you don’t already know.
98% of troubleshooting sessions end with strace.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Flask - cookie and token sessions simultaneously]]></title>
    <link href="http://northernmost.org/blog//flask-cookie-and-token-sessions-simultaneously/index.html"/>
    <updated>2017-01-06T16:02:00+00:00</updated>
    <id>http://northernmost.org/blog//flask-cookie-and-token-sessions-simultaneously/flask-cookie-and-token-sessions-simultaneously</id>
    <content type="html"><![CDATA[<p>Dealing with sessions in <a href="http://flask.pocoo.org/">Flask</a> applications is
rather simple! There is <a href="http://flask.pocoo.org/snippets/category/sessions/">plenty of choice</a> in pre-rolled implementations that are more or less
plug-and-play.</p>

<p>However, sometimes you may want (or need) to colour outside the lines, where a
cookie-cutter implementation either doesn&rsquo;t work, or gets in the way.</p>

<p>One such scenario is if you have an application which needs to act both as a
web front-end, with your typical cookie-based sessions, as well as an API
endpoint.
Requiring cookies when you&rsquo;re acting as an API endpoint isn&rsquo;t particularly
nice; tokens in the request header are the way to go!
So how can you get Flask sessions to work with both these methods of
identification?</p>

<!--more-->


<p>Perhaps at this point, I should add that you might be best served by
reconsidering your strategy here, and make the API endpoint a distinct
application from the one driving your UI. You can still share all your code for
your models and logic and can even make use of a layer 7 load balancer to deal
with the separation for you.
But be it due to retrofitting, time constraint, legacy or otherwise imposed design.. here goes;</p>

<p>Since Flask is a pretty lightweight framework, it&rsquo;s easily extended or wrestled
into submission. Luckily for us, it offers a pluggable way to write your own
session handling!</p>

<p>I&rsquo;ve put a small example application with a custom session interface <a href="https://github.com/eljrax/flask_token_cookie_sessions">on
GitHub</a>, which
allows what we&rsquo;ve previously discussed. You can either distinguish sessions by
a cookie, if present, or a header of your choosing (cookie trumps header, if
both are present). This header defaults to the
de-facto standard <code>X-Auth-Token</code> in the example, but you can configure this
easily.
For ease of use, the datastore used to store the sessions is memcached. But
it&rsquo;s very easily replaced by any other datastore.</p>
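<p>The precedence rule itself is small enough to sketch on its own. The Python below is not the code from the repository, just a minimal illustration with a hypothetical function name; it takes plain dictionaries instead of a Flask request, and assumes the header is called <code>X-Auth-Token</code>, as in the example requests further down.</p>

```python
def pick_session_id(cookies, headers, cookie_name="session",
                    header_name="X-Auth-Token"):
    """Return (session_id, source); the cookie wins if both are present."""
    if cookies.get(cookie_name):
        return cookies[cookie_name], "cookie"
    # HTTP header names are case-insensitive, so compare them lower-cased.
    wanted = header_name.lower()
    for name, value in headers.items():
        if name.lower() == wanted and value:
            return value, "header"
    return None, None
```

<p>Knowing the source matters for the response too: a session identified by cookie should get a cookie back, while one identified by header should not.</p>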

<p>The example is as small and compact as possible while remaining runnable. <strong>There
are no &ldquo;bells and whistles&rdquo; such as actual authentication</strong>, that&rsquo;s for you to handle
outside of the session handler.
You will also most likely want to extend the error checking and handling.</p>

<p>Do note - there&rsquo;s a <a href="https://docs.docker.com/compose/install/">docker-compose</a> file
included in the repository, which will enable you to quickly get up and
running.
Alternatively you can simply run <code>pip install -r requirements.txt &amp;&amp; ./runserver.py</code> from
within the <code>app/</code> directory, provided that you have the required system
dependencies.</p>

<p>Here&rsquo;s an example of using this session handler with cookies:</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class=''><span class='line'>$ docker-compose up -d
</span><span class='line'># -d optional, leave it off to run in the foreground
</span><span class='line'>
</span><span class='line'>$ http http://localhost:9000
</span><span class='line'>HTTP/1.0 200 OK
</span><span class='line'>Content-Length: 64
</span><span class='line'>Content-Type: text/html; charset=utf-8
</span><span class='line'>Date: Fri, 06 Jan 2017 20:22:38 GMT
</span><span class='line'>Server: Werkzeug/0.11.15 Python/2.7.6
</span><span class='line'>Set-Cookie: session=e53941b4-dc32-4e30-902a-a197cd1140b5; Expires=Fri,
</span><span class='line'>06-Jan-2017 20:23:08 GMT; HttpOnly; Path=/
</span><span class='line'>
</span><span class='line'>The random identifier stored with your session is: 05ed02a2-48ef-4c5a-8588-9a87356ddad9
</span><span class='line'>
</span><span class='line'>$ http -b http://localhost:9000 Cookie:session=e53941b4-dc32-4e30-902a-a197cd1140b5 
</span><span class='line'>The random identifier stored with your session is: 05ed02a2-48ef-4c5a-8588-9a87356ddad9</span></code></pre></td></tr></table></div></figure>


<p>Since we don&rsquo;t send a JSON body containing the key <code>token</code>, or set the
<code>X-Auth-Token</code> header, the session handler determines the application should send a cookie.</p>

<p>The example has a session timeout of a mighty 30 seconds (configurable,
obviously).</p>

<p>Now, if we were to behave like an API, on the other hand:</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class=''><span class='line'>$ echo '{ "token": "pretend_token"}' | http  --json POST http://localhost:9000
</span><span class='line'>HTTP/1.0 200 OK
</span><span class='line'>Content-Length: 64
</span><span class='line'>Content-Type: text/html; charset=utf-8
</span><span class='line'>Date: Fri, 06 Jan 2017 17:56:33 GMT
</span><span class='line'>Server: Werkzeug/0.11.15 Python/2.7.6
</span><span class='line'>
</span><span class='line'>The random identifier stored with your session is: d4945e3e-21bc-42db-9b1b-a0c941a25ddb
</span><span class='line'>
</span><span class='line'>$ http -b http://localhost:9000 x-auth-token:pretend_token
</span><span class='line'>The random identifier stored with your session is: d4945e3e-21bc-42db-9b1b-a0c941a25ddb
</span><span class='line'>
</span><span class='line'>$ sleep 30
</span><span class='line'>$ http -b http://localhost:9000 x-auth-token:pretend_token
</span><span class='line'>The random identifier stored with your session is: 7095b0eb-1efa-4b75-b9e2-a02c7f6e837b</span></code></pre></td></tr></table></div></figure>


<p>As you can see, we don&rsquo;t get a cookie sent back, because we behaved like an
API client. We can also see that we get a brand new session after the 30 seconds have
elapsed.</p>
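<p>The expiry behaviour can be mimicked without memcached. Below is a minimal in-memory Python sketch, assuming a fixed 30 second lifetime that is not extended on access; the class and method names are hypothetical and this is not how the example application stores its sessions.</p>

```python
import time
import uuid


class ExpiringSessions:
    """Tiny in-memory stand-in for a session store with a fixed timeout."""

    def __init__(self, ttl_seconds=30, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}  # session id -> (expiry time, random identifier)

    def identifier_for(self, session_id):
        """Return the identifier stored for session_id, renewing if expired."""
        now = self._clock()
        entry = self._data.get(session_id)
        if entry is None or entry[0] <= now:
            # Unseen or expired session: store a fresh random identifier.
            entry = (now + self._ttl, str(uuid.uuid4()))
            self._data[session_id] = entry
        return entry[1]
```

<p>Calling <code>identifier_for("pretend_token")</code> twice within 30 seconds returns the same identifier, and a new one after the timeout. The <code>clock</code> parameter only exists so the timeout can be tested without sleeping.</p>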

<p>The example also comes with a test suite for verification. You can execute this
by simply running <code>make tests</code>:</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class=''><span class='line'>$ make tests
</span><span class='line'>docker-compose exec session_example /bin/bash -c "cd /app ; python -m unittest discover -s tests/"
</span><span class='line'>Previously unseen session... Setting identifier
</span><span class='line'>Previously unseen session... Setting identifier
</span><span class='line'>.Previously unseen session... Setting identifier
</span><span class='line'>Previously unseen session... Setting identifier
</span><span class='line'>..Previously unseen session... Setting identifier
</span><span class='line'>...Previously unseen session... Setting identifier
</span><span class='line'>...
</span><span class='line'>----------------------------------------------------------------------
</span><span class='line'>Ran 9 tests in 0.027s
</span><span class='line'>
</span><span class='line'>OK</span></code></pre></td></tr></table></div></figure>


<p>The tests all run in a docker container, so the first time you run it, you&rsquo;ll most likely see an image
being built, and a memcached image being pulled.</p>

<p>Hope this helps someone!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Swap usage - 5 years later]]></title>
    <link href="http://northernmost.org/blog//swap-usage-5-years-later/index.html"/>
    <updated>2016-12-19T14:28:04+00:00</updated>
    <id>http://northernmost.org/blog//swap-usage-5-years-later/swap-usage-5-years-later</id>
    <content type="html"><![CDATA[<p>Skip to the end for a TL;DR</p>

<p>I&rsquo;ve neglected this blog a bit in the last year or so. I&rsquo;ve written a lot of
documentation and given a lot of training internally at work, so there hasn&rsquo;t
been an enormous amount of time I&rsquo;ve been able or willing to spend on it.
However;</p>

<p>Five years ago, I wrote <a href="http://northernmost.org/blog//find-out-what-is-using-your-swap/">an article</a>
presenting a script which lets you find what processes have pages in your swap memory, and how much they consume.
This article is still by far the most popular one I&rsquo;ve ever written, and it still sees a fair amount of traffic, so I
wanted to write a bit of an updated version, and fill in some of the things I
probably should have mentioned more in depth in the original one.</p>

<p>Let&rsquo;s get one thing out of the way first - the script in the original article
is now redundant. It actually already was redundant in certain cases when I
wrote about it, but now it&rsquo;s <em>really</em> redundant. To get the same information
in any currently supported distribution,
simply launch <code>top</code>, press <code>f</code>, scroll down to where it says <code>SWAP</code>, press space
followed by <code>q</code>. There we go, script redundant.</p>
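<p>For scripting rather than interactive use, Linux also exposes a per-process figure as the <code>VmSwap</code> line in <code>/proc/&lt;pid&gt;/status</code>. A minimal Python sketch, with a hypothetical function name and made-up sample data, that takes the file contents as a string:</p>

```python
def vmswap_kib(status_text):
    """Return the VmSwap value in KiB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            # The line looks like "VmSwap:      1234 kB".
            return int(line.split()[1])
    # Some entries (kernel threads, for example) have no VmSwap line.
    return None


# Made-up excerpt of a status file, for illustration only.
SAMPLE = """\
Name:\tmysqld
VmRSS:\t  204800 kB
VmSwap:\t    5120 kB
Threads:\t37
"""

print(vmswap_kib(SAMPLE))
```

<p>On a live system you would feed it something like <code>open("/proc/self/status").read()</code>.</p>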

<p>With that cleared up - what I didn&rsquo;t mention five years ago is <em>why</em> you would
want to know this. The short answer is; you probably don&rsquo;t! At least not for
the reasons most people seem to want to.</p>

<p>When it comes to swap, the Linux virtual memory manager doesn&rsquo;t really deal
with &lsquo;programs&rsquo;. It only deals with pages of memory (commonly 4096 bytes, see
<code>getconf PAGE_SIZE</code> to find out what your system uses).</p>

<p>Instead, when memory is under pressure and the kernel needs to decide what pages
to commit to swap, it will do so according to an LRU (Least Recently Used) algorithm.
Well, that&rsquo;s a gross oversimplification. There is a <em>lot</em> of magic going on
there which I won&rsquo;t pretend to know in any greater detail. There is also
<code>mlock()</code> and <code>madvise()</code> which developers can use to influence these things.</p>

<p>But in essence - the VMM will amongst other things deliberately put pages of memory
which are <strong>infrequently</strong> used to swap, to ensure that frequently accessed
memory pages are kept in the much faster, resident memory.</p>

<p>So if you landed on my original article, wanting to find out what program was
misbehaving by seeing what was in swap, like many appear to have done, please
reconsider your troubleshooting strategy!
Besides, a memory page being in swap isn&rsquo;t a problem in and of itself -
it&rsquo;s only when those pages are shifted from RAM onto disk and back that you
experience a slowdown. Pages that were swapped out once, and remain there for
the lifetime of the process they belong to, don&rsquo;t contribute to any noticeable
issues.</p>

<p>It&rsquo;s also worth noting, that once a page has been swapped out to disk, it <strong>will not</strong>
be swapped back into RAM again until it needs to be accessed by the process.
Therefore, pages being in swap may be an indicator of a previous memory
pressure issue, rather than one currently in progress.</p>

<p>So how do you know if you&rsquo;re &ldquo;swapping&rdquo; ?</p>

<p>Most of the time you can just tell, you&rsquo;ll notice.
But for more modest swapping; the command <code>sar -W 1 20</code> will print
out swapping statistics each second for 20 seconds.
If you observe the second and third columns (<code>pswpin/s</code> and <code>pswpout/s</code>), you will see how many
pages you swap in and out respectively. If these numbers are 0 or near 0, you&rsquo;re
unlikely to notice any issues due to swapping.
Not everyone has sar installed - so another command you can run is
<code>vmstat 1 20</code> and look at the <code>si</code> and <code>so</code> columns for Swap In and Swap Out.</p>
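<p>If you would rather have a script watch those two columns, the <code>vmstat</code> output is easy enough to pick apart. A minimal Python sketch, assuming the usual two header rows followed by one row per sample; the function name and the sample numbers are made up.</p>

```python
def swap_activity(vmstat_output):
    """Return a list of (si, so) pairs, one per sample row of vmstat output."""
    samples = []
    si_index = so_index = None
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "si" in fields and "so" in fields:
            # Header row naming the columns; remember where si and so are.
            si_index, so_index = fields.index("si"), fields.index("so")
        elif si_index is not None and fields and fields[0].isdigit():
            samples.append((int(fields[si_index]), int(fields[so_index])))
    return samples


# Made-up sample output, for illustration only.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0  10240 812340  20480 402112    0    0     3    12   90  150  2  1 97  0  0
 2  1  18432 790112  20480 398004  128  512   140   600  310  520  9  4 70 17  0
"""

print(swap_activity(SAMPLE))
```

<p>Rows that stay at or near <code>(0, 0)</code> mean swap is not being touched, no matter how much of it is in use.</p>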

<p>So in summary;</p>

<ul>
<li><code>top</code> can show you how much swap a process is using</li>
<li>A process using swap isn&rsquo;t necessarily (or even is rarely) a badly behaved
process</li>
<li>Swap isn&rsquo;t inherently bad, it&rsquo;s only bad when it&rsquo;s used frequently</li>
<li>The presence of pages in swap doesn&rsquo;t necessarily indicate a <em>current</em> memory
resource issue</li>
</ul>

]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[sar rebooting ubuntu]]></title>
    <link href="http://northernmost.org/blog//sar-rebooting-ubuntu/index.html"/>
    <updated>2015-10-30T17:27:45+00:00</updated>
    <id>http://northernmost.org/blog//sar-rebooting-ubuntu/sar-rebooting-ubuntu</id>
    <content type="html"><![CDATA[<p>Today I had a colleague approach me about a oneliner I sent him many months
ago, saying that it kept rebooting a server he was running it on.</p>

<p>It was little more than running <code>sar</code> in a loop, extracting some values and
running another command if certain thresholds were exceeded.
Hardly anything that you&rsquo;d think would result in a reboot.</p>

<p>After whittling down the oneliner to the offending command, it turned out that
<code>sar</code> was the culprit.
Some further debugging revealed that sar merely spawns a process called
<code>sadc</code>, which does the actual heavy lifting.</p>

<p>In certain circumstances, if you send SIGINT (ctrl+c, for example) to sar, it
can exit before sadc has done its thing.<br/>
When that happens, sadc becomes an orphan, and /sbin/init being a good little init system, takes
it under its wing and becomes its parent process.</p>

<p>When <code>sadc</code> receives the SIGINT signal, its signal handler will pass it up to its parent process&hellip; You see
where this is going, right?<br/>
Yep, /sbin/init gets the signal, and does what it should do. Initiates a reboot.</p>
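<p>To make the failure mode concrete, here is a small Python sketch of a SIGINT handler that forwards the signal to its parent, with a guard for the orphaned case. This is not the sysstat code or its fix, just an illustration; the function names are hypothetical, and it assumes an orphan is adopted by pid 1, as described above.</p>

```python
import os
import signal


def forward_sigint_target(parent_pid):
    """Return the pid SIGINT should be forwarded to, or None if unsafe."""
    # A parent pid of 1 means the original parent is gone and init has
    # adopted us; forwarding SIGINT there would ask init to reboot.
    if parent_pid <= 1:
        return None
    return parent_pid


def on_sigint(signum, frame):
    """Signal handler: pass SIGINT up to the parent unless we are orphaned."""
    target = forward_sigint_target(os.getppid())
    if target is not None:
        os.kill(target, signal.SIGINT)
```

<p>The handler would be installed with <code>signal.signal(signal.SIGINT, on_sigint)</code>. Without the check on the parent pid, an orphaned process does exactly what <code>sadc</code> did here.</p>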

<!--more-->


<p>If you want to reboot an Ubuntu 14.x server, simply run this in a terminal (as root,
this is NOT a DoS/vulnerability, merely a bug):</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class=''><span class='line'>root@elcrashtest:~# echo $(sar -b 1 5)
</span><span class='line'>^C
</span><span class='line'>root@elcrashtest:~# ^C
</span><span class='line'>root@elcrashtest:~#
</span><span class='line'>Broadcast message from root@elcrashtest
</span><span class='line'>    (unknown) at 18:06 ...
</span><span class='line'>
</span><span class='line'>    The system is going down for reboot NOW!
</span><span class='line'>    Control-Alt-Delete pressed</span></code></pre></td></tr></table></div></figure>


<p>Rapidly hitting ctrl+c twice does the trick.<br/>
Obviously this command doesn&rsquo;t make sense to run in isolation, but the bug was
hit in the context of a more involved oneliner, and being in a subprocess seems
to trigger it more often.
You may need to run it a couple of times as a few
things need to line up for it to happen. The above command reboots the server
like 8-9/10 times.</p>

<p>If executed in another subshell, you only need to hit ctrl+c once to trigger
it.</p>

<p>A more unrealistic, but sure-fire way to trigger it looks like this:</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class=''><span class='line'>root@elcrashtest:~# sar -b 1 100 &gt; /dev/null &
</span><span class='line'>[1] 3777
</span><span class='line'>root@elcrashtest:~# kill -SIGKILL $! ; kill -SIGINT $(pidof sadc);
</span><span class='line'>Broadcast message from root@elcrashtest
</span><span class='line'>...</span></code></pre></td></tr></table></div></figure>


<p>Basically, this kills sar forcefully (thus orphaning sadc), and then sends SIGINT to
sadc. This has a 100% success rate.</p>

<p>This was <a href="https://github.com/sysstat/sysstat/blob/master/sadc.c#L241-L248">fixed in 2014</a>, but Canonical has neglected to backport it.<br/>
A colleague of mine, who is a much better OSS citizen than I am, has
<a href="https://bugs.launchpad.net/ubuntu/+source/sysstat/+bug/1511778">raised this with Canonical</a>.</p>

<p>I only tested this on Ubuntu 14.04 and 14.10. Debian and RedHat/CentOS do not appear
to suffer from this. It&rsquo;s surprising that it&rsquo;s still present in Ubuntu Trusty, since the fix is backported in Debian Jessie.</p>

<p>Only on a Friday afternoon&hellip;</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[GRE tunnels and UFW]]></title>
    <link href="http://northernmost.org/blog//gre-tunnels-and-ufw/index.html"/>
    <updated>2015-09-14T19:17:10+01:00</updated>
    <id>http://northernmost.org/blog//gre-tunnels-and-ufw/gre-tunnels-and-ufw</id>
    <content type="html"><![CDATA[<p>Today I wrote an Ansible playbook to set up an environment for a docker demo I
will be giving shortly.
In the demo I will be using three hosts, and I want the containers to be able
to speak to each other across hosts.
To this end, I&rsquo;m using <a href="http://openvswitch.org/">Open vSwitch</a>. The setup is
quite straightforward: set up the bridge, get the meshed GRE tunnels up and off you
go.<br/>
I first set this up in a lab, with firewalls disabled. But knowing that
I will give the demo on public infrastructure, I still wrote the play
to allow everything on a particular interface (an isolated cloud-network)
through UFW.<br/>
When I ran my playbook against a few cloud servers, the containers couldn&rsquo;t
talk to each other on account of the GRE tunnels not working.</p>

<!--more-->


<p>So I enabled logging in UFW, and soon started seeing these types of entries</p>

<figure class='code'><div class="highlight"><table><tr><td class='code'><pre><code class=''><span class='line'>[UFW BLOCK] IN=eth2 OUT= MAC=&lt;redacted&gt;
</span><span class='line'>SRC=&lt;redacted&gt; DST=&lt;redacted&gt; LEN=76 TOS=0x00 PREC=0x00 TTL=64 ID=36639 DF
</span><span class='line'>PROTO=47</span></code></pre></td></tr></table></div></figure>


<p>Upon checking which rule actually dropped the packets (<code>iptables -L -nv</code>), it
transpired that the culprit was:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>1    97 DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0
</span><span class='line'>ctstate INVALID</span></code></pre></td></tr></table></div></figure>


<p>It turns out that a
<a href="http://marc.info/?l=netfilter&amp;m=142195881513706&amp;w=2">change</a> in the 3.18 kernel and onwards means
that unless either the <code>nf_conntrack_pptp</code> or the <code>nf_conntrack_proto_gre</code>
module is loaded, any GRE packets will be marked as INVALID, as opposed to
NEW and subsequently ESTABLISHED.</p>

<p>So in order to get Open vSwitch working with UFW, there are two solutions: either
explicitly allow <a href="https://en.wikipedia.org/wiki/List_of_IP_protocol_numbers">protocol 47</a>, or load one of the aforementioned kernel modules.</p>

<p>Should you go for the former solution, this is the rule you need to beat to the
punch:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class='BASH'><span class='line'><span class="nv">$ </span>grep -A <span class="m">2</span> <span class="s2">&quot;drop INVALID&quot;</span> /etc/ufw/before.rules
</span><span class='line'><span class="c"># drop INVALID packets (logs these in loglevel medium and higher)</span>
</span><span class='line'>-A ufw-before-input -m conntrack --ctstate INVALID -j ufw-logging-deny
</span><span class='line'>-A ufw-before-input -m conntrack --ctstate INVALID -j DROP
</span></code></pre></td></tr></table></div></figure>


<p>with <code>-A ufw-before-input -p 47 -i $iface -j ACCEPT</code>, placed above those lines, where <code>$iface</code> is the interface carrying the GRE traffic.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[LVM thinpool for docker storage on Fedora 22]]></title>
    <link href="http://northernmost.org/blog//lvm-thinpool-for-docker-storage-on-fedora-22/index.html"/>
    <updated>2015-09-08T10:31:58+01:00</updated>
    <id>http://northernmost.org/blog//lvm-thinpool-for-docker-storage-on-fedora-22/lvm-thinpool-for-docker-storage-on-fedora-22</id>
    <content type="html"><![CDATA[<p><strong>TL;DR</strong>: You can use <code>docker-storage-setup</code> without the root fs being on LVM by
passing the DEVS and VG environment variables to the script, or by setting them in
<code>/etc/sysconfig/docker-storage-setup</code>.</p>

<p>I stumbled across this article the other day <a href="http://www.projectatomic.io/blog/2015/06/notes-on-fedora-centos-and-docker-storage-drivers/">&lsquo;Friends Don&rsquo;t Let Friends Run
Docker on Loopback in
Production&rsquo;</a></p>

<p>I also saw this <a href="https://bugzilla.redhat.com/show_bug.cgi?id=1260189">bug</a>
being raised, saying docker-storage-setup doesn&rsquo;t work with the Fedora 22 cloud
image, as the root fs isn&rsquo;t on LVM.</p>

<!--more-->


<p>I decided to try this out, so I created some block storage and a Fedora 22 VM
on the Rackspace cloud:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
</pre></td><td class='code'><pre><code class='BASH'><span class='line'><span class="nv">$ </span>cinder create --display-name docker-storage --volume-type 1fd376b5-c84e-43c5-a66b-d895cb75ac2c 75
</span><span class='line'><span class="c"># Verify that it&#39;s built and is available</span>
</span><span class='line'><span class="nv">$ </span>cinder show 359b01b7-541c-4f4d-b2e7-279d778079a4
</span><span class='line'><span class="c"># Build a Fedora 22 server with the volume attached</span>
</span><span class='line'>nova boot --image 2cc5db1b-2fc8-42ae-8afb-d30c68037f02 <span class="se">\</span>
</span><span class='line'>--flavor performance1-1 <span class="se">\</span>
</span><span class='line'>--block-device-mapping <span class="nv">xvdb</span><span class="o">=</span>359b01b7-541c-4f4d-b2e7-279d778079a4 <span class="se">\</span>
</span><span class='line'>docker-storage-test
</span></code></pre></td></tr></table></div></figure>


<p>Once on the machine, I followed the article above:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='BASH'><span class='line'><span class="nv">$ </span>dnf -y install docker
</span><span class='line'><span class="nv">$ </span>systemctl stop docker
</span><span class='line'><span class="nv">$ </span>rm -rf /var/lib/docker/
</span></code></pre></td></tr></table></div></figure>


<p>And here&rsquo;s where the bug report I linked earlier comes into play.
<code>docker-storage-setup</code> is just a bash script, and if you just take a look at this
output:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
</pre></td><td class='code'><pre><code class='BASH'><span class='line'>docker-storage-setup --help
</span><span class='line'>Usage: /usr/bin/docker-storage-setup <span class="o">[</span>OPTIONS<span class="o">]</span>
</span><span class='line'>
</span><span class='line'>Grows the root filesystem and sets up storage <span class="k">for</span> docker.
</span><span class='line'>
</span><span class='line'>Options:
</span><span class='line'>  -h, --help            Print <span class="nb">help </span>message.
</span></code></pre></td></tr></table></div></figure>


<p>It sure gives the impression of only doing one single thing - growing the root
FS!
As the bug rightly points out, the Fedora cloud image doesn&rsquo;t come with LVM for
the root FS (which is a good thing!), so there&rsquo;s no VG for this script to grow.</p>

<p>So unless you <a href="https://raw.githubusercontent.com/projectatomic/docker-storage-setup/master/docker-storage-setup.sh">read the
script</a>, or the manpage, you wouldn&rsquo;t necessarily notice that what
<code>--help</code> says is just the default behaviour, and that you can make
<code>docker-storage-setup</code> use an ephemeral disk and leave the root fs alone.
The kicker lies in two environment variables (as opposed to
arguments to the script itself, as is more common): <code>$DEVS</code> and <code>$VG</code>.
If you supply both of those, and the disk you give in DEVS has no partition
table and the VG you supply doesn&rsquo;t exist, the script will partition the disk
and create all the necessary bits for LVM on that disk:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
<span class='line-number'>31</span>
<span class='line-number'>32</span>
<span class='line-number'>33</span>
<span class='line-number'>34</span>
<span class='line-number'>35</span>
<span class='line-number'>36</span>
<span class='line-number'>37</span>
<span class='line-number'>38</span>
<span class='line-number'>39</span>
<span class='line-number'>40</span>
<span class='line-number'>41</span>
<span class='line-number'>42</span>
<span class='line-number'>43</span>
<span class='line-number'>44</span>
<span class='line-number'>45</span>
<span class='line-number'>46</span>
<span class='line-number'>47</span>
<span class='line-number'>48</span>
<span class='line-number'>49</span>
</pre></td><td class='code'><pre><code class='BASH'><span class='line'><span class="c"># Verify that ephemeral disk has no partition table:</span>
</span><span class='line'><span class="nv">$ </span>partx -s /dev/xvdb
</span><span class='line'>partx: /dev/xvdb: failed to <span class="nb">read </span>partition table
</span><span class='line'>
</span><span class='line'><span class="c"># Start lvmetad</span>
</span><span class='line'><span class="nv">$ </span>systemctl start lvm2-lvmetad
</span><span class='line'>
</span><span class='line'><span class="nv">$ DEVS</span><span class="o">=</span><span class="s2">&quot;/dev/xvdb&quot;</span> <span class="nv">VG</span><span class="o">=</span><span class="s2">&quot;docker-data&quot;</span> docker-storage-setup
</span><span class='line'>  Volume group <span class="s2">&quot;xvda1&quot;</span> not found
</span><span class='line'>  Cannot process volume group xvda1
</span><span class='line'>Checking that no-one is using this disk right now ... OK
</span><span class='line'>
</span><span class='line'>Disk /dev/xvdb: <span class="m">75</span> GiB, <span class="m">80530636800</span> bytes, <span class="m">157286400</span> sectors
</span><span class='line'>Units: sectors of <span class="m">1</span> * <span class="nv">512</span> <span class="o">=</span> <span class="m">512</span> bytes
</span><span class='line'>Sector size <span class="o">(</span>logical/physical<span class="o">)</span>: <span class="m">512</span> bytes / <span class="m">512</span> bytes
</span><span class='line'>I/O size <span class="o">(</span>minimum/optimal<span class="o">)</span>: <span class="m">512</span> bytes / <span class="m">512</span> bytes
</span><span class='line'>
</span><span class='line'>&gt;&gt;&gt; Script header accepted.
</span><span class='line'>&gt;&gt;&gt; Created a new DOS disklabel with disk identifier 0x2b7ebb69.
</span><span class='line'>Created a new partition <span class="m">1</span> of <span class="nb">type</span> <span class="s1">&#39;Linux LVM&#39;</span> and of size <span class="m">75</span> GiB.
</span><span class='line'>/dev/xvdb2:
</span><span class='line'>New situation:
</span><span class='line'>
</span><span class='line'>Device     Boot Start       End   Sectors Size Id Type
</span><span class='line'>/dev/xvdb1       <span class="m">2048</span> <span class="m">157286399</span> <span class="m">157284352</span>  75G 8e Linux LVM
</span><span class='line'>
</span><span class='line'>The partition table has been altered.
</span><span class='line'>Calling ioctl<span class="o">()</span> to re-read partition table.
</span><span class='line'>Syncing disks.
</span><span class='line'>  Physical volume <span class="s2">&quot;/dev/xvdb1&quot;</span> successfully created
</span><span class='line'>  Volume group <span class="s2">&quot;docker-data&quot;</span> successfully created
</span><span class='line'>  Rounding up size to full physical extent 80.00 MiB
</span><span class='line'>  Logical volume <span class="s2">&quot;docker-poolmeta&quot;</span> created.
</span><span class='line'>  Logical volume <span class="s2">&quot;docker-pool&quot;</span> created.
</span><span class='line'>  WARNING: Converting logical volume docker-data/docker-pool and docker-data/docker-poolmeta to pool<span class="err">&#39;</span>s data and metadata volumes.
</span><span class='line'>  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME <span class="o">(</span>filesystem etc.<span class="o">)</span>
</span><span class='line'>  Converted docker-data/docker-pool to thin pool.
</span><span class='line'>  Logical volume <span class="s2">&quot;docker-pool&quot;</span> changed.
</span><span class='line'>
</span><span class='line'><span class="c"># Verify that the script wrote the docker-storage file</span>
</span><span class='line'><span class="nv">$ </span>cat /etc/sysconfig/docker-storage
</span><span class='line'><span class="nv">DOCKER_STORAGE_OPTIONS</span><span class="o">=</span>--storage-driver devicemapper --storage-opt dm.fs<span class="o">=</span>xfs
</span><span class='line'>--storage-opt dm.thinpooldev<span class="o">=</span>/dev/mapper/docker--data-docker--pool
</span><span class='line'>--storage-opt dm.use_deferred_removal<span class="o">=</span><span class="nb">true</span>
</span><span class='line'>
</span><span class='line'><span class="c"># Verify that the LV is there:</span>
</span><span class='line'><span class="nv">$ </span>lvs
</span><span class='line'>  LV          VG          Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
</span><span class='line'>  docker-pool docker-data twi-a-t--- 44.95g             0.00   0.07
</span></code></pre></td></tr></table></div></figure>


<p>So now the script has created the LV thinpool, and written the required docker
configuration.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
</pre></td><td class='code'><pre><code class='BASH'><span class='line'><span class="nv">$ </span>systemctl start docker
</span><span class='line'><span class="nv">$ </span>docker info
</span><span class='line'>Containers: 0
</span><span class='line'>Images: 0
</span><span class='line'>Storage Driver: devicemapper
</span><span class='line'> Pool Name: docker--data-docker--pool
</span><span class='line'> Pool Blocksize: 524.3 kB
</span><span class='line'> Backing Filesystem: extfs
</span><span class='line'> Data file:
</span><span class='line'> Metadata file:
</span><span class='line'> Data Space Used: 19.92 MB
</span><span class='line'> Data Space Total: 48.26 GB
</span><span class='line'> Data Space Available: 48.24 GB
</span><span class='line'> Metadata Space Used: 65.54 kB
</span><span class='line'> Metadata Space Total: 83.89 MB
</span><span class='line'> Metadata Space Available: 83.82 MB
</span><span class='line'> Udev Sync Supported: <span class="nb">true</span>
</span><span class='line'><span class="nb"> </span>Deferred Removal Enabled: <span class="nb">true</span>
</span><span class='line'><span class="nb"> </span>Library Version: 1.02.93 <span class="o">(</span>2015-01-30<span class="o">)</span>
</span><span class='line'>Execution Driver: native-0.2
</span><span class='line'>Logging Driver: json-file
</span><span class='line'>Kernel Version: 4.0.8-300.fc22.x86_64
</span><span class='line'>Operating System: Fedora <span class="m">22</span> <span class="o">(</span>Twenty Two<span class="o">)</span>
</span><span class='line'>CPUs: 1
</span><span class='line'>Total Memory: 987.8 MiB
</span><span class='line'>Name: docker-storage-test
</span><span class='line'>ID: EYKV:Q5D6:4F3Y:Z5X3:ZILX:ZBVI:2YF6:VHD7:RFQS:IWWO:MOFL:EWO7
</span></code></pre></td></tr></table></div></figure>


<p>No trace of /dev/loop0!
And to verify that it&rsquo;s actually using our thinpool:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
</pre></td><td class='code'><pre><code class='BASH'><span class='line'><span class="nv">$ </span>lvdisplay <span class="p">|</span> egrep <span class="s2">&quot;Allocated pool data&quot;</span> <span class="p">;</span> du -sh /var/lib/docker/ <span class="p">;</span> docker pull centos:6 <span class="p">;</span> du -sh /var/lib/docker <span class="p">;</span> lvdisplay <span class="p">|</span> egrep <span class="s2">&quot;Allocated pool data&quot;</span>
</span><span class='line'>  Allocated pool data    0.04%
</span><span class='line'>5.6M    total
</span><span class='line'>6: Pulling from docker.io/centos
</span><span class='line'>47d44cb6f252: Pull <span class="nb">complete</span>
</span><span class='line'>6a7b54515901: Pull <span class="nb">complete</span>
</span><span class='line'>e788880c8cfa: Pull <span class="nb">complete</span>
</span><span class='line'>1debf8fb53e6: Pull <span class="nb">complete</span>
</span><span class='line'>72703a0520b7: Already exists
</span><span class='line'>docker.io/centos:6: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.
</span><span class='line'>Digest: sha256:5436a8b20d6cdf638d936ce1486e277294f6a1360a7b630b9ef76b30d9a88aec
</span><span class='line'>Status: Downloaded newer image <span class="k">for</span> docker.io/centos:6
</span><span class='line'>5.8M    total
</span><span class='line'>  Allocated pool data    0.53%
</span></code></pre></td></tr></table></div></figure>


<p>In conclusion - the script could definitely do with being updated to use
command line arguments for this, rather than environment variables, and with having
the <code>--help</code> output updated to highlight this.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[readdir and directories on xfs]]></title>
    <link href="http://northernmost.org/blog//readdir-and-directories-on-xfs/index.html"/>
    <updated>2015-01-16T13:34:46+00:00</updated>
    <id>http://northernmost.org/blog//readdir-and-directories-on-xfs/readdir-and-directories-on-xfs</id>
    <content type="html"><![CDATA[<p>Recently I had some pretty unexpected results from a piece of code I wrote quite a while ago, and never had any issues with.
I ran my program on a brand new CentOS 7 installation, and the results weren&rsquo;t at all what I was used to!</p>

<p>Consider the following code (abridged and simplified):</p>

<figure class='code'><figcaption><span>readdir_xfs.c</span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
<span class='line-number'>31</span>
<span class='line-number'>32</span>
<span class='line-number'>33</span>
<span class='line-number'>34</span>
<span class='line-number'>35</span>
<span class='line-number'>36</span>
<span class='line-number'>37</span>
<span class='line-number'>38</span>
<span class='line-number'>39</span>
<span class='line-number'>40</span>
<span class='line-number'>41</span>
<span class='line-number'>42</span>
<span class='line-number'>43</span>
<span class='line-number'>44</span>
<span class='line-number'>45</span>
<span class='line-number'>46</span>
<span class='line-number'>47</span>
<span class='line-number'>48</span>
</pre></td><td class='code'><pre><code class='C'><span class='line'><span class="cp">#include &lt;stdio.h&gt;</span>
</span><span class='line'><span class="cp">#include &lt;dirent.h&gt;</span>
</span><span class='line'><span class="cp">#include &lt;sys/types.h&gt;</span>
</span><span class='line'><span class="cp">#include &lt;string.h&gt;</span>
</span><span class='line'><span class="cp">#include &lt;limits.h&gt;</span>
</span><span class='line'><span class="kt">void</span> <span class="nf">recursive_dir</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">path</span><span class="p">){</span>
</span><span class='line'>  <span class="kt">DIR</span> <span class="o">*</span><span class="n">dir</span><span class="p">;</span>
</span><span class='line'>  <span class="k">struct</span> <span class="n">dirent</span> <span class="o">*</span><span class="n">de</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'>  <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="p">(</span><span class="n">dir</span> <span class="o">=</span> <span class="n">opendir</span><span class="p">(</span><span class="n">path</span><span class="p">))){</span>
</span><span class='line'>    <span class="n">perror</span><span class="p">(</span><span class="s">&quot;opendir&quot;</span><span class="p">);</span>
</span><span class='line'>    <span class="k">return</span><span class="p">;</span>
</span><span class='line'>  <span class="p">}</span>
</span><span class='line'>  <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="p">(</span><span class="n">de</span> <span class="o">=</span> <span class="n">readdir</span><span class="p">(</span><span class="n">dir</span><span class="p">))){</span>
</span><span class='line'>    <span class="n">perror</span><span class="p">(</span><span class="s">&quot;readdir&quot;</span><span class="p">);</span>
</span><span class='line'>    <span class="k">return</span><span class="p">;</span>
</span><span class='line'>  <span class="p">}</span>
</span><span class='line'>
</span><span class='line'>  <span class="k">do</span> <span class="p">{</span>
</span><span class='line'>
</span><span class='line'>    <span class="k">if</span> <span class="p">(</span><span class="n">strcmp</span> <span class="p">(</span><span class="n">de</span><span class="o">-&gt;</span><span class="n">d_name</span><span class="p">,</span> <span class="s">&quot;.&quot;</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span> <span class="o">||</span> <span class="n">strcmp</span> <span class="p">(</span><span class="n">de</span><span class="o">-&gt;</span><span class="n">d_name</span><span class="p">,</span> <span class="s">&quot;..&quot;</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
</span><span class='line'>      <span class="k">continue</span><span class="p">;</span>
</span><span class='line'>    <span class="p">}</span>
</span><span class='line'>
</span><span class='line'>    <span class="k">if</span> <span class="p">(</span><span class="n">de</span><span class="o">-&gt;</span><span class="n">d_type</span> <span class="o">==</span> <span class="n">DT_DIR</span><span class="p">){</span>
</span><span class='line'>      <span class="kt">char</span> <span class="n">full_path</span><span class="p">[</span><span class="n">PATH_MAX</span><span class="p">];</span>
</span><span class='line'>      <span class="n">snprintf</span><span class="p">(</span><span class="n">full_path</span><span class="p">,</span> <span class="n">PATH_MAX</span><span class="p">,</span> <span class="s">&quot;%s/%s&quot;</span><span class="p">,</span> <span class="n">path</span><span class="p">,</span> <span class="n">de</span><span class="o">-&gt;</span><span class="n">d_name</span><span class="p">);</span>
</span><span class='line'>      <span class="n">printf</span><span class="p">(</span><span class="s">&quot;Dir: %s</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">,</span> <span class="n">full_path</span><span class="p">);</span>
</span><span class='line'>      <span class="n">recursive_dir</span><span class="p">(</span><span class="n">full_path</span><span class="p">);</span>
</span><span class='line'>    <span class="p">}</span>
</span><span class='line'>    <span class="k">else</span> <span class="p">{</span>
</span><span class='line'>      <span class="n">printf</span><span class="p">(</span><span class="s">&quot;</span><span class="se">\t</span><span class="s">File: %s%s</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">,</span> <span class="n">path</span><span class="p">,</span> <span class="n">de</span><span class="o">-&gt;</span><span class="n">d_name</span><span class="p">);</span>
</span><span class='line'>    <span class="p">}</span>
</span><span class='line'>  <span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">de</span> <span class="o">=</span> <span class="n">readdir</span><span class="p">(</span><span class="n">dir</span><span class="p">));</span>
</span><span class='line'>  <span class="n">closedir</span><span class="p">(</span><span class="n">dir</span><span class="p">);</span>
</span><span class='line'>
</span><span class='line'><span class="p">}</span>
</span><span class='line'>
</span><span class='line'><span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">int</span> <span class="n">argc</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="n">argv</span><span class="p">[]){</span>
</span><span class='line'>
</span><span class='line'>  <span class="k">if</span> <span class="p">(</span><span class="n">argc</span> <span class="o">&lt;</span> <span class="mi">2</span><span class="p">){</span>
</span><span class='line'>    <span class="n">fprintf</span><span class="p">(</span><span class="n">stderr</span><span class="p">,</span> <span class="s">&quot;Usage: %s &lt;dir&gt;</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">,</span> <span class="n">argv</span><span class="p">[</span><span class="mi">0</span><span class="p">]);</span>
</span><span class='line'>    <span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
</span><span class='line'>  <span class="p">}</span>
</span><span class='line'>
</span><span class='line'>  <span class="n">recursive_dir</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">]);</span>
</span><span class='line'>  <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<!--more-->


<p>Pretty straightforward: it reads directories, and prints them and the files within them.
Now here&rsquo;s the kicker:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
</pre></td><td class='code'><pre><code class='BASH'><span class='line'><span class="nv">$ </span>gcc -g dirtraverse.c -o dirtraverse <span class="o">&amp;&amp;</span> ./dirtraverse /data_ext4/
</span><span class='line'>Dir: /data_ext4//dir1
</span><span class='line'>        File: /data_ext4//dir1file3
</span><span class='line'>        File: /data_ext4//dir1file1
</span><span class='line'>        File: /data_ext4//dir1file2
</span><span class='line'>Dir: /data_ext4//dir2
</span><span class='line'>        File: /data_ext4//dir2file1
</span><span class='line'>Dir: /data_ext4//dir3
</span><span class='line'><span class="nv">$ </span>rsync -a --delete /data_ext4/ /data_xfs/  <span class="c"># Ensure directories are identical</span>
</span><span class='line'><span class="nv">$ </span>gcc -g dirtraverse.c -o dirtraverse <span class="o">&amp;&amp;</span> ./dirtraverse /data_xfs/
</span><span class='line'>        File: /data_xfs/dir1
</span><span class='line'>        File: /data_xfs/dir2
</span><span class='line'>        File: /data_xfs/dir3
</span></code></pre></td></tr></table></div></figure>


<p>No traversal?</p>

<p>After a bit of head scratching, and a few debug statements, I found that when using readdir(3) on XFS, dirent-&gt;d_type is always 0 (<code>DT_UNKNOWN</code>), no matter what type of file it is!
This means that the condition on line #25 can never be true.</p>

<p>To be fair though, the manpage states that POSIX only mandates dirent->d_name.</p>

<p>So to make your directory traversal code more portable, make use of stat(2) and the S_ISDIR() macro!</p>
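
<p>As a rough illustration of that advice, here is a minimal sketch of a directory check that doesn&rsquo;t rely on d_type at all. The helper name <code>is_directory</code> and its return convention are made up for this example and are not part of the program above; it simply asks stat(2) about a path and tests the result with S_ISDIR().</p>

```c
#include <sys/stat.h>

/*
 * Hypothetical helper (not from the original program):
 * returns 1 if `path` is a directory, 0 if it is something else,
 * and -1 if stat(2) fails (the caller can inspect errno).
 */
int is_directory(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;

    /* S_ISDIR() inspects the file type bits of st_mode */
    return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

<p>In a traversal loop like the one above, you would build the full path first and then branch on <code>is_directory(full_path) == 1</code> instead of comparing d_type to DT_DIR. Bear in mind that stat(2) follows symlinks, so a symlink pointing at a directory counts as a directory here; lstat(2) would be the closer match to how d_type behaves, and is probably what you want if symlink loops are a concern.</p>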
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[How does MySQL hide the command line password in ps?]]></title>
    <link href="http://northernmost.org/blog//how-does-mysql-hide-the-command-line-password-in-ps/index.html"/>
    <updated>2012-03-10T05:03:46+00:00</updated>
    <id>http://northernmost.org/blog//how-does-mysql-hide-the-command-line-password-in-ps/how-does-mysql-hide-the-command-line-password-in-ps</id>
    <content type="html"><![CDATA[<p>I saw this question asked today, and thought I&rsquo;d write a quick post about it.
Giving passwords on the command line isn&rsquo;t necessarily a fantastic idea - but you can sort of see where they&rsquo;re coming from. Configuration files and environment variables are better, but only slightly. Security is a nightmare!</p>

<p>But if you do decide to write an application which takes a password (or any other sensitive information) on the command line, you can prevent other users on the system from easily seeing it like this:</p>

<!--more-->




<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
</pre></td><td class='code'><pre><code class='C'><span class='line'><span class="cp">#include &lt;stdio.h&gt;</span>
</span><span class='line'><span class="cp">#include &lt;unistd.h&gt;</span>
</span><span class='line'><span class="cp">#include &lt;string.h&gt;</span>
</span><span class='line'><span class="cp">#include &lt;sys/types.h&gt;</span>
</span><span class='line'>
</span><span class='line'><span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">int</span> <span class="n">argc</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="n">argv</span><span class="p">[]){</span>
</span><span class='line'>
</span><span class='line'>    <span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span><span class='line'>    <span class="kt">pid_t</span> <span class="n">mypid</span> <span class="o">=</span> <span class="n">getpid</span><span class="p">();</span>
</span><span class='line'>    <span class="k">if</span> <span class="p">(</span><span class="n">argc</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span>
</span><span class='line'>        <span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
</span><span class='line'>    <span class="n">printf</span><span class="p">(</span><span class="s">&quot;argc = %d and arguments are:</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">,</span> <span class="n">argc</span><span class="p">);</span>
</span><span class='line'>    <span class="k">for</span> <span class="p">(</span><span class="n">i</span> <span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">argc</span> <span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
</span><span class='line'>        <span class="n">printf</span><span class="p">(</span><span class="s">&quot;%d = %s</span><span class="se">\n</span><span class="s">&quot;</span> <span class="p">,</span><span class="n">i</span><span class="p">,</span> <span class="n">argv</span><span class="p">[</span><span class="n">i</span><span class="p">]);</span>
</span><span class='line'>    <span class="n">printf</span><span class="p">(</span><span class="s">&quot;Replacing first argument with x:es... Now open another terminal and run: ps p %d</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">,</span> <span class="p">(</span><span class="kt">int</span><span class="p">)</span><span class="n">mypid</span><span class="p">);</span>
</span><span class='line'>    <span class="n">fflush</span><span class="p">(</span><span class="n">stdout</span><span class="p">);</span>
</span><span class='line'>    <span class="n">memset</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="sc">&#39;x&#39;</span><span class="p">,</span> <span class="n">strlen</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">]));</span>
</span><span class='line'>    <span class="n">getc</span><span class="p">(</span><span class="n">stdin</span><span class="p">);</span>
</span><span class='line'>    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>A sample run looks like this:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='C'><span class='line'><span class="err">$</span> <span class="p">.</span><span class="o">/</span><span class="n">pwhide</span> <span class="n">abcd</span>
</span><span class='line'><span class="n">argc</span> <span class="o">=</span> <span class="mi">2</span> <span class="n">and</span> <span class="n">arguments</span> <span class="nl">are</span><span class="p">:</span>
</span><span class='line'><span class="mi">0</span> <span class="o">=</span> <span class="p">.</span><span class="o">/</span><span class="n">pwhide</span>
</span><span class='line'><span class="mi">1</span> <span class="o">=</span> <span class="n">abcd</span>
</span><span class='line'><span class="n">Replacing</span> <span class="n">first</span> <span class="n">argument</span> <span class="n">with</span> <span class="nl">x</span><span class="p">:</span><span class="n">es</span><span class="p">...</span> <span class="n">Now</span> <span class="nl">run</span><span class="p">:</span> <span class="n">ps</span> <span class="n">p</span> <span class="mi">27913</span>
</span></code></pre></td></tr></table></div></figure>


<p>In another terminal:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='C'><span class='line'><span class="err">$</span> <span class="n">ps</span> <span class="n">p</span> <span class="mi">27913</span>
</span><span class='line'>  <span class="n">PID</span> <span class="n">TTY</span>      <span class="n">STAT</span>   <span class="n">TIME</span> <span class="n">COMMAND</span>
</span><span class='line'><span class="mi">27913</span> <span class="n">pts</span><span class="o">/</span><span class="mi">1</span>    <span class="n">S</span><span class="o">+</span>     <span class="mi">0</span><span class="o">:</span><span class="mo">00</span> <span class="p">.</span><span class="o">/</span><span class="n">pwhide</span> <span class="n">xxxx</span>
</span></code></pre></td></tr></table></div></figure>


<p>In the interest of brevity, the above code isn&rsquo;t very portable - but it works on Linux and hopefully the point of it comes across. In other environments, such as FreeBSD, you have the setproctitle() library function to do the dirty work for you. The key thing here is the overwriting of argv[1]. In a real program you would do this as early as possible, since the password remains visible until the memset() has run.
Because the memory for argv[] is allocated when the program starts, you can&rsquo;t easily obfuscate the length of the password. I say easily - because of course <a href="http://stupefydeveloper.blogspot.com/2008/10/linux-change-process-name.html">there is a way</a>.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Font rendering - no more jealousy]]></title>
    <link href="http://northernmost.org/blog//font-rendering-no-more-jealousy/index.html"/>
    <updated>2012-02-28T20:02:17+00:00</updated>
    <id>http://northernmost.org/blog//font-rendering-no-more-jealousy/font-rendering-no-more-jealousy</id>
    <content type="html"><![CDATA[<p>I suppose this kind of content is what most people use Twitter for these days. But since I&rsquo;ve remained strong and stayed well away from that, I suppose I will have to be a tad retro and write a short blog post about it.
If you, like me, are an avid <a href="http://fedoraproject.org">Fedora</a> user, I&rsquo;m sure you&rsquo;ve thrown glances at colleagues&rsquo; or friends&rsquo; Ubuntu machines and thought that there was something slightly different about the way they looked (aside from the obvious Gnome vs Unity differences). Shinier somehow&hellip; So had I, but I mainly dismissed it as a case of &ldquo;the grass is always greener&hellip;&rdquo;.</p>

<p>It turns out that the grass actually IS greener.</p>

<!--more-->


<p>Tonight I stumbled upon <a href="http://www.infinality.net/blog/infinality-freetype-patches/">this</a>. It&rsquo;s a patched version of FreeType. For what I assume are political reasons (free as in speech), Fedora ships a FreeType version without subpixel rendering. These patches fix that and <a href="http://www.infinality.net/forum/viewtopic.php?f=2&t=18">other things</a>.</p>

<p>With a default configuration file of 407 lines, it&rsquo;s quite extensible and configurable as well. Luckily, I quite like the defaults!</p>

<p>If you&rsquo;re not entirely happy with the way your fonts look on Fedora - it&rsquo;s well worth a look.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Transactions and code testing]]></title>
    <link href="http://northernmost.org/blog//transactions-and-code-testing/index.html"/>
    <updated>2011-08-18T13:08:29+01:00</updated>
    <id>http://northernmost.org/blog//transactions-and-code-testing/transactions-and-code-testing</id>
    <content type="html"><![CDATA[<p>A little while ago I worked with a customer to migrate their DB from using MyISAM to InnoDB (something I definitely don&rsquo;t mind doing!).
I set up a smaller test instance with all tables using the InnoDB engine as part of the testing. I instructed them to thoroughly test their application against this test instance and let me know if they identified any issues.</p>

<p>They reported back that everything seemed fine, and we went off to do the actual migration. Everything went according to plan and all seemed well.
After a while they started seeing some discrepancies in the stock portion of their application. The data didn&rsquo;t add up with what they expected and stock levels seemed surprisingly high. A program run from cron was responsible for periodically updating the stock count of products, so this was of course the first place I looked.
I ran it manually and looked at its output; it was very verbose and reported that some 2000 products had been updated. But looking at the actual DB, this was far from the case.</p>

<!--more-->


<p>Still having the test environment available, I ran it a few times against that and could see the <code>com_update</code> and <code>com_insert</code> counters being incremented, so I knew the queries were making it there. But the data remained unchanged. At this point, I had a gut feeling about what was going on, so to confirm it I enabled query logging to see what was actually happening. It didn&rsquo;t take me long to spot the problem. On the second line of the log, I saw this:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>       40 Query set autocommit=0</span></code></pre></td></tr></table></div></figure>


<p>The program responsible for updating the stock levels was a Python script using <a href="http://mysql-python.sourceforge.net/">MySQLdb</a>. I couldn&rsquo;t see any traces of autocommit being set explicitly, so I went on assuming that it was off by default (which turned out to be <a href="http://www.python.org/dev/peps/pep-0249/">correct</a>). With MyISAM this had never mattered, since that engine isn&rsquo;t transactional and every write takes effect immediately; with InnoDB, the uncommitted changes were rolled back when the script disconnected. After adding a <code>commit()</code> call on the connection once the relevant queries had been sent to the server, everything was back to normal as far as stock levels were concerned.
Since the code itself was seeing its own transaction, values such as <code>cursor.rowcount</code>, which the testers had relied on, were all correct.</p>

<p>The lesson here: when testing your software from a database point of view, don&rsquo;t blindly trust what your code tells you it&rsquo;s done; make sure it&rsquo;s actually done it by verifying the data!
A lot of things can happen to data between your program and the platters. Its transaction can deadlock and be rolled back, it can be reading cached data, it can get lost in a crashing message queue, etc.</p>
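<p>The effect is easy to reproduce. The sketch below uses Python&rsquo;s built-in sqlite3 module as a stand-in for MySQLdb and InnoDB (an assumption made only so that the example runs without a server; the stock table and the function names are made up). The locking details differ between SQLite and InnoDB, but by default sqlite3 also leaves it to you to commit your writes, so the visible result is the same: the job&rsquo;s own rowcount looks fine while everybody else still sees the old data.</p>

```python
import os
import sqlite3
import tempfile


def stock_level(db_path, product_id):
    """Read the stock level through a brand new connection."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT qty FROM stock WHERE product_id = ?", (product_id,)
        ).fetchall()
        return rows[0][0]
    finally:
        conn.close()


def demo():
    db_path = os.path.join(tempfile.mkdtemp(), "stock.db")

    # Set up one product with 10 items in stock.
    setup = sqlite3.connect(db_path)
    setup.execute("CREATE TABLE stock (product_id INTEGER PRIMARY KEY, qty INTEGER)")
    setup.execute("INSERT INTO stock VALUES (1, 10)")
    setup.commit()
    setup.close()

    # The "stock update job": it changes the row but has not committed yet.
    job = sqlite3.connect(db_path)
    cur = job.cursor()
    cur.execute("UPDATE stock SET qty = 3 WHERE product_id = 1")
    reported = cur.rowcount                  # what the job says it did
    before_commit = stock_level(db_path, 1)  # what everyone else sees

    job.commit()
    after_commit = stock_level(db_path, 1)
    job.close()
    return reported, before_commit, after_commit
```

<p>Here <code>demo()</code> returns <code>(1, 10, 3)</code>: the job reports one updated row, a fresh connection still sees 10 until the commit, and 3 afterwards. Checking through a separate connection is the important part.</p>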

<p>As a rule of thumb, I&rsquo;m rather against setting a blanket <code>autocommit=1</code> in code; I&rsquo;ve seen that come back to haunt developers in the past. I&rsquo;m a strong advocate for explicit transaction handling.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Find out what is using your swap]]></title>
    <link href="http://northernmost.org/blog//find-out-what-is-using-your-swap/index.html"/>
    <updated>2011-05-27T16:46:40+01:00</updated>
    <id>http://northernmost.org/blog//find-out-what-is-using-your-swap/find-out-what-is-using-your-swap</id>
    <content type="html"><![CDATA[<p><strong>This article is now over five years old, please consider reading a more
<a href="http://northernmost.org/blog//swap-usage-5-years-later/">recent version</a></strong></p>

<p>Have you ever logged in to a server, run <code>free</code>, seen that a bit of swap is used and wondered what&rsquo;s in there? Knowing usually isn&rsquo;t very indicative of anything, or even overly helpful; mostly it&rsquo;s a curiosity thing.</p>

<p>Either way, starting from kernel 2.6.16, we can find out using smaps which can be found in the proc filesystem. I&rsquo;ve written a simple bash script which prints out all running processes and their swap usage.
It&rsquo;s quick and dirty, but does the job and can easily be modified to work on any info exposed in /proc/$PID/smaps.
If I find the time and inspiration, I might tidy it up and extend it a bit to cover some more alternatives. The output is in kilobytes.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
</pre></td><td class='code'><pre><code class='bash'><span class='line'><span class="c">#!/bin/bash</span>
</span><span class='line'><span class="c"># Get current swap usage for all running processes</span>
</span><span class='line'><span class="c"># Erik Ljungstrom 27/05/2011</span>
</span><span class='line'><span class="nv">SUM</span><span class="o">=</span>0
</span><span class='line'><span class="nv">OVERALL</span><span class="o">=</span>0
</span><span class='line'><span class="k">for</span> DIR in <span class="sb">`</span>find /proc/ -maxdepth <span class="m">1</span> -type d <span class="p">|</span> egrep <span class="s2">&quot;^/proc/[0-9]&quot;</span><span class="sb">`</span> <span class="p">;</span> <span class="k">do</span>
</span><span class='line'>        <span class="nv">PID</span><span class="o">=</span><span class="sb">`</span><span class="nb">echo</span> <span class="nv">$DIR</span> <span class="p">|</span> cut -d / -f 3<span class="sb">`</span>
</span><span class='line'>        <span class="nv">PROGNAME</span><span class="o">=</span><span class="sb">`</span>ps -p <span class="nv">$PID</span> -o comm --no-headers<span class="sb">`</span>
</span><span class='line'>        <span class="k">for</span> SWAP in <span class="sb">`</span>grep <span class="s1">&#39;^Swap:&#39;</span> <span class="nv">$DIR</span>/smaps 2&gt;/dev/null<span class="p">|</span> awk <span class="s1">&#39;{ print $2 }&#39;</span><span class="sb">`</span>
</span><span class='line'>        <span class="k">do</span>
</span><span class='line'>                <span class="nb">let </span><span class="nv">SUM</span><span class="o">=</span><span class="nv">$SUM</span>+<span class="nv">$SWAP</span>
</span><span class='line'>        <span class="k">done</span>
</span><span class='line'>        <span class="nb">echo</span> <span class="s2">&quot;PID=$PID - Swap used: $SUM - ($PROGNAME )&quot;</span>
</span><span class='line'>        <span class="nb">let </span><span class="nv">OVERALL</span><span class="o">=</span><span class="nv">$OVERALL</span>+<span class="nv">$SUM</span>
</span><span class='line'>        <span class="nv">SUM</span><span class="o">=</span>0
</span><span class='line'>
</span><span class='line'><span class="k">done</span>
</span><span class='line'><span class="nb">echo</span> <span class="s2">&quot;Overall swap used: $OVERALL&quot;</span>
</span></code></pre></td></tr></table></div></figure>


<p><em>This will need to be run as root</em> for it to be able to gather accurate numbers. It will still work if you aren&rsquo;t, but it will report 0 for any processes not owned by your user.
Needless to say, it&rsquo;s Linux only. The output is ordered alphabetically according to your locale (which admittedly isn&rsquo;t a great thing since we&rsquo;re dealing with numbers), but you can easily apply your standard shell magic to the output. For instance, to find the process with the most swap used, sort numerically on the fifth field (the biggest consumer ends up last):</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='bash'><span class='line'><span class="nv">$ </span>./getswap.sh <span class="p">|</span> sort -n -k <span class="m">5</span>
</span></code></pre></td></tr></table></div></figure>


<p>Don&rsquo;t want to see stuff that&rsquo;s not using swap at all?</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='bash'><span class='line'><span class="nv">$ </span>./getswap.sh  <span class="p">|</span> egrep -v <span class="s2">&quot;Swap used: 0&quot;</span> <span class="p">|</span>sort -n -k 5
</span></code></pre></td></tr></table></div></figure>


<p>&hellip; and so on and so forth.</p>
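<p>For reference, the same per-process sum can be sketched in Python. This is not a replacement for the script above, just an illustration of the parsing; the function names are made up, and unlike the bash version it skips processes whose smaps it can&rsquo;t read instead of reporting 0 for them.</p>

```python
import os


def swap_kb(smaps_text):
    """Sum the 'Swap:' fields (in kB) found in one process's smaps contents."""
    total = 0
    for line in smaps_text.splitlines():
        fields = line.split()
        # Match the field name exactly, so that similarly named fields
        # such as 'SwapPss:' on newer kernels are not counted as well.
        if len(fields) >= 2 and fields[0] == "Swap:":
            total += int(fields[1])
    return total


def swap_by_pid(proc="/proc"):
    """Return {pid: swap_kb} for every process whose smaps is readable."""
    usage = {}
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc, entry, "smaps")) as fh:
                usage[int(entry)] = swap_kb(fh.read())
        except OSError:
            # The process has exited, or it belongs to another user
            # and we are not root.
            continue
    return usage
```

<p>Something like <code>sorted(swap_by_pid().items(), key=lambda kv: kv[1])</code> then gives the same ordering as piping the script through <code>sort -n -k 5</code>.</p>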
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Example using Cassandra with Thrift in C++]]></title>
    <link href="http://northernmost.org/blog//example-using-cassandra-with-thrift-in-c-plus-plus/index.html"/>
    <updated>2011-05-21T20:09:46+01:00</updated>
    <id>http://northernmost.org/blog//example-using-cassandra-with-thrift-in-c-plus-plus/example-using-cassandra-with-thrift-in-c-plus-plus</id>
    <content type="html"><![CDATA[<p>Due to a very exciting, recently launched project at work, I&rsquo;ve had to interface with Cassandra through C++ code. As anyone who has done this can testify, the API docs are vague at best, and there are very few examples out there. The constant API changes between 0.x versions aren&rsquo;t helpful either, nor is the fact that the Cassandra API has its <a href="http://wiki.apache.org/cassandra/API">docs</a> and Thrift has <a href="http://wiki.apache.org/thrift/">its own</a>, with nothing bridging the two.
So at the moment it is very much a case of dissecting header files and looking at implementation in the Thrift generated source files.</p>

<p>The only somewhat useful example of using Cassandra with C++ one can find online is <a href="http://posulliv.github.com/2010/02/22/cpp-cassandra.html">this</a>, but due to the API changes, this is now outdated (it&rsquo;s still worth a read).</p>

<p>So in the hope that nobody else will have to spend the better part of a day piecing things together to achieve even the most basic thing, here&rsquo;s an example which works with Cassandra 0.7 and Thrift 0.6.</p>

<!--more-->


<p>First of all, create a new keyspace and a column family, using cassandra-cli:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>[default@unknown] create keyspace nm_example;
</span><span class='line'>c647b2c0-83e2-11e0-9eb2-e700f669bcfc
</span><span class='line'>Waiting for schema agreement...
</span><span class='line'>... schemas agree across the cluster
</span><span class='line'>[default@unknown] use nm_example;
</span><span class='line'>Authenticated to keyspace: nm_example
</span><span class='line'>[default@nm_example] create column family nm_cfamily with comparator=BytesType and default_validation_class=BytesType;
</span><span class='line'>30466721-83e3-11e0-9eb2-e700f669bcfc
</span><span class='line'>Waiting for schema agreement...
</span><span class='line'>... schemas agree across the cluster
</span><span class='line'>[default@nm_example]
</span></code></pre></td></tr></table></div></figure>


<p>Now go to the directory where you have Cassandra installed, enter the <strong>interface/</strong> directory and run: <code>thrift -gen cpp cassandra.thrift</code>.
This will create the <code>gen-cpp/</code> directory. From this directory, you need to copy all files except Cassandra_server.skeleton.cpp to wherever you intend to keep your sources.
Here&rsquo;s some example code which inserts, retrieves, updates, retrieves again and deletes keys:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
<span class='line-number'>31</span>
<span class='line-number'>32</span>
<span class='line-number'>33</span>
<span class='line-number'>34</span>
<span class='line-number'>35</span>
<span class='line-number'>36</span>
<span class='line-number'>37</span>
<span class='line-number'>38</span>
<span class='line-number'>39</span>
<span class='line-number'>40</span>
<span class='line-number'>41</span>
<span class='line-number'>42</span>
<span class='line-number'>43</span>
<span class='line-number'>44</span>
<span class='line-number'>45</span>
<span class='line-number'>46</span>
<span class='line-number'>47</span>
<span class='line-number'>48</span>
<span class='line-number'>49</span>
<span class='line-number'>50</span>
<span class='line-number'>51</span>
<span class='line-number'>52</span>
<span class='line-number'>53</span>
<span class='line-number'>54</span>
<span class='line-number'>55</span>
<span class='line-number'>56</span>
<span class='line-number'>57</span>
<span class='line-number'>58</span>
<span class='line-number'>59</span>
<span class='line-number'>60</span>
<span class='line-number'>61</span>
<span class='line-number'>62</span>
<span class='line-number'>63</span>
<span class='line-number'>64</span>
<span class='line-number'>65</span>
<span class='line-number'>66</span>
<span class='line-number'>67</span>
<span class='line-number'>68</span>
<span class='line-number'>69</span>
<span class='line-number'>70</span>
<span class='line-number'>71</span>
<span class='line-number'>72</span>
<span class='line-number'>73</span>
<span class='line-number'>74</span>
<span class='line-number'>75</span>
<span class='line-number'>76</span>
<span class='line-number'>77</span>
<span class='line-number'>78</span>
<span class='line-number'>79</span>
<span class='line-number'>80</span>
<span class='line-number'>81</span>
<span class='line-number'>82</span>
<span class='line-number'>83</span>
<span class='line-number'>84</span>
<span class='line-number'>85</span>
<span class='line-number'>86</span>
<span class='line-number'>87</span>
<span class='line-number'>88</span>
<span class='line-number'>89</span>
<span class='line-number'>90</span>
<span class='line-number'>91</span>
</pre></td><td class='code'><pre><code class='C++'><span class='line'><span class="cp">#include &quot;Cassandra.h&quot;</span>
</span><span class='line'>
</span><span class='line'><span class="cp">#include &lt;protocol/TBinaryProtocol.h&gt;</span>
</span><span class='line'><span class="cp">#include &lt;thrift/transport/TSocket.h&gt;</span>
</span><span class='line'><span class="cp">#include &lt;thrift/transport/TTransportUtils.h&gt;</span>
</span><span class='line'>
</span><span class='line'><span class="k">using</span> <span class="k">namespace</span> <span class="n">std</span><span class="p">;</span>
</span><span class='line'><span class="k">using</span> <span class="k">namespace</span> <span class="n">apache</span><span class="o">::</span><span class="n">thrift</span><span class="p">;</span>
</span><span class='line'><span class="k">using</span> <span class="k">namespace</span> <span class="n">apache</span><span class="o">::</span><span class="n">thrift</span><span class="o">::</span><span class="n">protocol</span><span class="p">;</span>
</span><span class='line'><span class="k">using</span> <span class="k">namespace</span> <span class="n">apache</span><span class="o">::</span><span class="n">thrift</span><span class="o">::</span><span class="n">transport</span><span class="p">;</span>
</span><span class='line'><span class="k">using</span> <span class="k">namespace</span> <span class="n">org</span><span class="o">::</span><span class="n">apache</span><span class="o">::</span><span class="n">cassandra</span><span class="p">;</span>
</span><span class='line'><span class="k">using</span> <span class="k">namespace</span> <span class="n">boost</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'><span class="k">static</span> <span class="n">string</span> <span class="nf">host</span><span class="p">(</span><span class="s">&quot;127.0.0.1&quot;</span><span class="p">);</span>
</span><span class='line'><span class="k">static</span> <span class="kt">int</span> <span class="n">port</span><span class="o">=</span> <span class="mi">9160</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'><span class="kt">int64_t</span> <span class="nf">getTS</span><span class="p">(){</span>
</span><span class='line'>    <span class="cm">/* If you&#39;re doing things quickly, you may want to make use of tv_usec</span>
</span><span class='line'><span class="cm">     * or something here instead</span>
</span><span class='line'><span class="cm">     */</span>
</span><span class='line'>    <span class="kt">time_t</span> <span class="n">ltime</span><span class="p">;</span>
</span><span class='line'>    <span class="n">ltime</span><span class="o">=</span><span class="n">time</span><span class="p">(</span><span class="nb">NULL</span><span class="p">);</span>
</span><span class='line'>    <span class="k">return</span> <span class="p">(</span><span class="kt">int64_t</span><span class="p">)</span><span class="n">ltime</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'><span class="p">}</span>
</span><span class='line'>
</span><span class='line'><span class="kt">int</span> <span class="nf">main</span><span class="p">(){</span>
</span><span class='line'>    <span class="n">shared_ptr</span><span class="o">&lt;</span><span class="n">TTransport</span><span class="o">&gt;</span> <span class="n">socket</span><span class="p">(</span><span class="k">new</span> <span class="n">TSocket</span><span class="p">(</span><span class="n">host</span><span class="p">,</span> <span class="n">port</span><span class="p">));</span>
</span><span class='line'>    <span class="n">shared_ptr</span><span class="o">&lt;</span><span class="n">TTransport</span><span class="o">&gt;</span> <span class="n">transport</span><span class="p">(</span><span class="k">new</span> <span class="n">TFramedTransport</span><span class="p">(</span><span class="n">socket</span><span class="p">));</span>
</span><span class='line'>    <span class="n">shared_ptr</span><span class="o">&lt;</span><span class="n">TProtocol</span><span class="o">&gt;</span> <span class="n">protocol</span><span class="p">(</span><span class="k">new</span> <span class="n">TBinaryProtocol</span><span class="p">(</span><span class="n">transport</span><span class="p">));</span>
</span><span class='line'>    <span class="n">CassandraClient</span> <span class="n">client</span><span class="p">(</span><span class="n">protocol</span><span class="p">);</span>
</span><span class='line'>
</span><span class='line'>    <span class="k">const</span> <span class="n">string</span><span class="o">&amp;</span> <span class="n">key</span><span class="o">=</span><span class="s">&quot;your_key&quot;</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'>    <span class="n">ColumnPath</span> <span class="n">cpath</span><span class="p">;</span>
</span><span class='line'>    <span class="n">ColumnParent</span> <span class="n">cp</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'>    <span class="n">ColumnOrSuperColumn</span> <span class="n">csc</span><span class="p">;</span>
</span><span class='line'>    <span class="n">Column</span> <span class="n">c</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'>    <span class="n">c</span><span class="p">.</span><span class="n">name</span><span class="p">.</span><span class="n">assign</span><span class="p">(</span><span class="s">&quot;column_name&quot;</span><span class="p">);</span>
</span><span class='line'>    <span class="n">c</span><span class="p">.</span><span class="n">value</span><span class="p">.</span><span class="n">assign</span><span class="p">(</span><span class="s">&quot;Data for our key to go into column_name&quot;</span><span class="p">);</span>
</span><span class='line'>    <span class="n">c</span><span class="p">.</span><span class="n">timestamp</span> <span class="o">=</span> <span class="n">getTS</span><span class="p">();</span>
</span><span class='line'>    <span class="n">c</span><span class="p">.</span><span class="n">ttl</span> <span class="o">=</span> <span class="mi">300</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'>    <span class="n">cp</span><span class="p">.</span><span class="n">column_family</span><span class="p">.</span><span class="n">assign</span><span class="p">(</span><span class="s">&quot;nm_cfamily&quot;</span><span class="p">);</span>
</span><span class='line'>    <span class="n">cp</span><span class="p">.</span><span class="n">super_column</span><span class="p">.</span><span class="n">assign</span><span class="p">(</span><span class="s">&quot;&quot;</span><span class="p">);</span>
</span><span class='line'>
</span><span class='line'>    <span class="n">cpath</span><span class="p">.</span><span class="n">column_family</span><span class="p">.</span><span class="n">assign</span><span class="p">(</span><span class="s">&quot;nm_cfamily&quot;</span><span class="p">);</span>
</span><span class='line'>    <span class="cm">/* This is required - thrift &#39;feature&#39; */</span>
</span><span class='line'>    <span class="n">cpath</span><span class="p">.</span><span class="n">__isset</span><span class="p">.</span><span class="n">column</span> <span class="o">=</span> <span class="nb">true</span><span class="p">;</span>
</span><span class='line'>    <span class="n">cpath</span><span class="p">.</span><span class="n">column</span><span class="o">=</span><span class="s">&quot;column_name&quot;</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'>    <span class="n">try</span> <span class="p">{</span>
</span><span class='line'>        <span class="n">transport</span><span class="o">-&gt;</span><span class="n">open</span><span class="p">();</span>
</span><span class='line'>        <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;Set keyspace to &#39;nm_example&#39;..&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">endl</span><span class="p">;</span>
</span><span class='line'>        <span class="n">client</span><span class="p">.</span><span class="n">set_keyspace</span><span class="p">(</span><span class="s">&quot;nm_example&quot;</span><span class="p">);</span>
</span><span class='line'>
</span><span class='line'>        <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;Insert key &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">key</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39; in column &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">c</span><span class="p">.</span><span class="n">name</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39; in column family &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">cp</span><span class="p">.</span><span class="n">column_family</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39; with timestamp &quot;</span> <span class="o">&lt;&lt;</span> <span class="n">c</span><span class="p">.</span><span class="n">timestamp</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;...&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">endl</span><span class="p">;</span>
</span><span class='line'>        <span class="n">client</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">cp</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">org</span><span class="o">::</span><span class="n">apache</span><span class="o">::</span><span class="n">cassandra</span><span class="o">::</span><span class="n">ConsistencyLevel</span><span class="o">::</span><span class="n">ONE</span><span class="p">);</span>
</span><span class='line'>
</span><span class='line'>        <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;Retrieve key &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">key</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39; from column &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">cpath</span><span class="p">.</span><span class="n">column</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39; in column family &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">cpath</span><span class="p">.</span><span class="n">column_family</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39; again...&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">endl</span><span class="p">;</span>
</span><span class='line'>        <span class="n">client</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">csc</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">cpath</span><span class="p">,</span> <span class="n">org</span><span class="o">::</span><span class="n">apache</span><span class="o">::</span><span class="n">cassandra</span><span class="o">::</span><span class="n">ConsistencyLevel</span><span class="o">::</span><span class="n">ONE</span><span class="p">);</span>
</span><span class='line'>        <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;Value read is &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">csc</span><span class="p">.</span><span class="n">column</span><span class="p">.</span><span class="n">value</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39;...&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">endl</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'>        <span class="n">c</span><span class="p">.</span><span class="n">timestamp</span><span class="o">++</span><span class="p">;</span>
</span><span class='line'>        <span class="n">c</span><span class="p">.</span><span class="n">value</span><span class="p">.</span><span class="n">assign</span><span class="p">(</span><span class="s">&quot;Updated data going into column_name&quot;</span><span class="p">);</span>
</span><span class='line'>        <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;Update key &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">key</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39; in column with timestamp &quot;</span> <span class="o">&lt;&lt;</span> <span class="n">c</span><span class="p">.</span><span class="n">timestamp</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;...&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">endl</span><span class="p">;</span>
</span><span class='line'>        <span class="n">client</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">cp</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">org</span><span class="o">::</span><span class="n">apache</span><span class="o">::</span><span class="n">cassandra</span><span class="o">::</span><span class="n">ConsistencyLevel</span><span class="o">::</span><span class="n">ONE</span><span class="p">);</span>
</span><span class='line'>
</span><span class='line'>        <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;Retrieve updated key &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">key</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39; from column &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">cpath</span><span class="p">.</span><span class="n">column</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39; in column family &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">cpath</span><span class="p">.</span><span class="n">column_family</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39; again...&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">endl</span><span class="p">;</span>
</span><span class='line'>        <span class="n">client</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">csc</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">cpath</span><span class="p">,</span> <span class="n">org</span><span class="o">::</span><span class="n">apache</span><span class="o">::</span><span class="n">cassandra</span><span class="o">::</span><span class="n">ConsistencyLevel</span><span class="o">::</span><span class="n">ONE</span><span class="p">);</span>
</span><span class='line'>        <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;Updated value is: &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">csc</span><span class="p">.</span><span class="n">column</span><span class="p">.</span><span class="n">value</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">endl</span><span class="p">;</span>
</span><span class='line'>
</span><span class='line'>        <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;Remove the key &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">key</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39; we just retrieved. Value &#39;&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">csc</span><span class="p">.</span><span class="n">column</span><span class="p">.</span><span class="n">value</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;&#39; timestamp &quot;</span> <span class="o">&lt;&lt;</span> <span class="n">csc</span><span class="p">.</span><span class="n">column</span><span class="p">.</span><span class="n">timestamp</span> <span class="o">&lt;&lt;</span> <span class="s">&quot; ...&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">endl</span><span class="p">;</span>
</span><span class='line'>        <span class="n">client</span><span class="p">.</span><span class="n">remove</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">cpath</span><span class="p">,</span> <span class="n">csc</span><span class="p">.</span><span class="n">column</span><span class="p">.</span><span class="n">timestamp</span><span class="p">,</span> <span class="n">org</span><span class="o">::</span><span class="n">apache</span><span class="o">::</span><span class="n">cassandra</span><span class="o">::</span><span class="n">ConsistencyLevel</span><span class="o">::</span><span class="n">ONE</span><span class="p">);</span>
</span><span class='line'>
</span><span class='line'>        <span class="n">transport</span><span class="o">-&gt;</span><span class="n">close</span><span class="p">();</span>
</span><span class='line'>    <span class="p">}</span>
</span><span class='line'>    <span class="k">catch</span> <span class="p">(</span><span class="n">NotFoundException</span> <span class="o">&amp;</span><span class="n">nf</span><span class="p">){</span>
</span><span class='line'>        <span class="n">cerr</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;NotFoundException ERROR: &quot;</span><span class="o">&lt;&lt;</span> <span class="n">nf</span><span class="p">.</span><span class="n">what</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="n">endl</span><span class="p">;</span>
</span><span class='line'>    <span class="p">}</span>
</span><span class='line'>    <span class="k">catch</span> <span class="p">(</span><span class="n">InvalidRequestException</span> <span class="o">&amp;</span><span class="n">re</span><span class="p">)</span> <span class="p">{</span>
</span><span class='line'>        <span class="n">cerr</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;InvalidRequest ERROR: &quot;</span> <span class="o">&lt;&lt;</span> <span class="n">re</span><span class="p">.</span><span class="n">why</span> <span class="o">&lt;&lt;</span> <span class="n">endl</span><span class="p">;</span>
</span><span class='line'>    <span class="p">}</span>
</span><span class='line'>    <span class="k">catch</span> <span class="p">(</span><span class="n">TException</span> <span class="o">&amp;</span><span class="n">tx</span><span class="p">)</span> <span class="p">{</span>
</span><span class='line'>        <span class="n">cerr</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;TException ERROR: &quot;</span> <span class="o">&lt;&lt;</span> <span class="n">tx</span><span class="p">.</span><span class="n">what</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="n">endl</span><span class="p">;</span>
</span><span class='line'>    <span class="p">}</span>
</span><span class='line'>
</span><span class='line'>    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>Say you&rsquo;ve called the file cassandra_example.cpp and have the files mentioned above in the same directory. You can then compile and run it like this:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
</pre></td><td class='code'><pre><code class='C++'><span class='line'><span class="err">$</span> <span class="n">g</span><span class="o">++</span> <span class="o">-</span><span class="n">lthrift</span> <span class="o">-</span><span class="n">Wall</span>  <span class="n">cassandra_example</span><span class="p">.</span><span class="n">cpp</span> <span class="n">cassandra_constants</span><span class="p">.</span><span class="n">cpp</span> <span class="n">Cassandra</span><span class="p">.</span><span class="n">cpp</span> <span class="n">cassandra_types</span><span class="p">.</span><span class="n">cpp</span> <span class="o">-</span><span class="n">o</span> <span class="n">cassandra_example</span>
</span><span class='line'><span class="err">$</span> <span class="p">.</span><span class="o">/</span><span class="n">cassandra_example</span>
</span><span class='line'><span class="n">Set</span> <span class="n">keyspace</span> <span class="n">to</span> <span class="err">&#39;</span><span class="n">nm_example</span><span class="err">&#39;</span><span class="p">..</span>
</span><span class='line'><span class="n">Insert</span> <span class="n">key</span> <span class="err">&#39;</span><span class="n">your_key</span><span class="err">&#39;</span> <span class="n">in</span> <span class="n">column</span> <span class="err">&#39;</span><span class="n">column_name</span><span class="err">&#39;</span> <span class="n">in</span> <span class="n">column</span> <span class="n">family</span> <span class="err">&#39;</span><span class="n">nm_cfamily</span><span class="err">&#39;</span> <span class="n">with</span> <span class="n">timestamp</span> <span class="mf">1306008338.</span><span class="p">..</span>
</span><span class='line'><span class="n">Retrieve</span> <span class="n">key</span> <span class="err">&#39;</span><span class="n">your_key</span><span class="err">&#39;</span> <span class="n">from</span> <span class="n">column</span> <span class="err">&#39;</span><span class="n">column_name</span><span class="err">&#39;</span> <span class="n">in</span> <span class="n">column</span> <span class="n">family</span> <span class="err">&#39;</span><span class="n">nm_cfamily</span><span class="err">&#39;</span> <span class="n">again</span><span class="p">...</span>
</span><span class='line'><span class="n">Value</span> <span class="n">read</span> <span class="n">is</span> <span class="err">&#39;</span><span class="n">Data</span> <span class="k">for</span> <span class="n">our</span> <span class="n">key</span> <span class="n">to</span> <span class="n">go</span> <span class="n">into</span> <span class="n">column_name</span><span class="err">&#39;</span><span class="p">...</span>
</span><span class='line'><span class="n">Update</span> <span class="n">key</span> <span class="err">&#39;</span><span class="n">your_key</span><span class="err">&#39;</span> <span class="n">in</span> <span class="n">column</span> <span class="n">with</span> <span class="n">timestamp</span> <span class="mf">1306008339.</span><span class="p">..</span>
</span><span class='line'><span class="n">Retrieve</span> <span class="n">updated</span> <span class="n">key</span> <span class="err">&#39;</span><span class="n">your_key</span><span class="err">&#39;</span> <span class="n">from</span> <span class="n">column</span> <span class="err">&#39;</span><span class="n">column_name</span><span class="err">&#39;</span> <span class="n">in</span> <span class="n">column</span> <span class="n">family</span> <span class="err">&#39;</span><span class="n">nm_cfamily</span><span class="err">&#39;</span> <span class="n">again</span><span class="p">...</span>
</span><span class='line'><span class="n">Updated</span> <span class="n">value</span> <span class="nl">is</span><span class="p">:</span> <span class="err">&#39;</span><span class="n">Updated</span> <span class="n">data</span> <span class="n">going</span> <span class="n">into</span> <span class="n">column_name</span><span class="err">&#39;</span>
</span><span class='line'><span class="n">Remove</span> <span class="n">the</span> <span class="n">key</span> <span class="err">&#39;</span><span class="n">your_key</span><span class="err">&#39;</span> <span class="n">we</span> <span class="n">just</span> <span class="n">retrieved</span><span class="p">.</span> <span class="n">Value</span> <span class="err">&#39;</span><span class="n">Updated</span> <span class="n">data</span> <span class="n">going</span> <span class="n">into</span> <span class="n">column_name</span><span class="err">&#39;</span> <span class="n">timestamp</span> <span class="mi">1306008339</span> <span class="p">...</span>
</span></code></pre></td></tr></table></div></figure>


<p>Another thing worth mentioning is <a href="http://posulliv.com/">Padraig O'Sullivan&rsquo;s</a> <a href="https://github.com/posulliv/libcassandra">libcassandra</a>, which may or may not be worth a look depending on what you want to do and what versions of Thrift and Cassandra you&rsquo;re tied to.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Site slow after scaling out? Yeah, possibly!]]></title>
    <link href="http://northernmost.org/blog//site-slow-after-scaling-out-yeah-possibly/index.html"/>
    <updated>2011-03-29T06:03:46+01:00</updated>
    <id>http://northernmost.org/blog//site-slow-after-scaling-out-yeah-possibly/site-slow-after-scaling-out-yeah-possibly</id>
    <content type="html"><![CDATA[<p>Every now and then, we have customers who outgrow their single server setup. The next natural step is of course splitting the web layer from the DB layer. So they get another server, and move the database to that.</p>

<p>So far so good! A week or so later, we often get the call <em>&ldquo;Our page load time is higher now than before the upgrade! We&rsquo;ve got twice as much hardware, and it&rsquo;s slower! You have broken it!&rdquo;</em>
It&rsquo;s easy to see where they&rsquo;re coming from. It makes sense, right?</p>

<p>That is until you factor in the newly introduced network topology! Today it&rsquo;s not unusual (that&rsquo;s not to say it&rsquo;s acceptable or optimal) for your average
WordPress/Drupal/Joomla/otherspawnofsatan site to run 40-50 queries per page load. Quite often even more!</p>

<!--more-->


<p>Based on a tcpdump session of a reasonably average query (if there is such a thing), connecting to a server, authenticating, sending a query and receiving a 5 row result set of 1434 bytes results in 25 packets being sent between my laptop and a remote DB server on the same wired, non-congested network. A normal latency for TCP/IP over Ethernet is ~0.2 ms for the size of packets we&rsquo;re talking about here.
So, doing the maths for 50 queries, you&rsquo;re seeing <code>25*0.2*50=250ms</code> in network latency alone per page load for your SQL queries. This is obviously a lot more than you see over a local UNIX socket.</p>
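
<p>The back-of-the-envelope sum above can be written out as a small Python sketch. The constant and function names are made up for illustration, and the model assumes the queries run one after another, each paying for a fresh connection as in the tcpdump measurement:</p>

<figure class='code'><div class="highlight"><pre><code class='python'># Figures taken from the text above; adjust them for your own setup.
PACKETS_PER_QUERY = 25        # packets seen in the tcpdump session
LATENCY_MS_PER_PACKET = 0.2   # rough Ethernet latency for packets this size
QUERIES_PER_PAGE = 50         # upper end of the 40-50 queries per page load


def network_overhead_ms(packets, latency_ms, queries):
    """Rough serial network cost of one page load's SQL queries, in ms."""
    return packets * latency_ms * queries


overhead = network_overhead_ms(
    PACKETS_PER_QUERY, LATENCY_MS_PER_PACKET, QUERIES_PER_PAGE
)
print("%.0f ms of network latency per page load" % overhead)
</code></pre></div></figure>

<p>Halving the number of queries halves the result, which is why the query count is usually the first thing worth looking at.</p>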

<p>This is inevitable; it&rsquo;s the laws of physics. There is nothing you, your sysadmin or your hosting company can do about it. There may, however, be something your developer can do about the number of queries!
You also shouldn&rsquo;t confuse response times with availability. Your response times may be slower, but you can (hopefully) serve a lot more users with this setup!</p>

<p>Sure, there are <a href="http://www.dolphinics.com/">technologies</a> out there which have considerably less latency than Ethernet, but they come with quite the price-tag, and there are more often than not quite a few avenues to go down before it makes sense to start looking at that kind of thing.</p>

<p>You could also potentially look at running the full stack on both machines, using master/master replication for your DBs and load balancing your front-ends so that both read locally but only write to one node at a time! That kind of DB scenario is fairly easily set up using <a href="http://mysql-mmm.org/">mmm</a> for MySQL. But in my experience, this often ends up being more costly and introducing more complexity than it solves.
I&rsquo;m an avid advocate of keeping server roles separate as much as possible!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[A look at mysql-5-5 semi-synchronous replication]]></title>
    <link href="http://northernmost.org/blog//a-look-at-mysql-5-5-semi-synchronous-replication/index.html"/>
    <updated>2010-10-09T20:19:54+01:00</updated>
    <id>http://northernmost.org/blog//a-look-at-mysql-5-5-semi-synchronous-replication/a-look-at-mysql-5-5-semi-synchronous-replication</id>
    <content type="html"><![CDATA[<p>Now that MySQL 5.5 is in RC, I decided to have a look at the semi synchronous replication. It’s easy to get going, and from my very initial tests it appears to be working a treat.</p>

<p>This mode of replication is called semi synchronous due to the fact that it only guarantees that at least one of the slaves has written the transaction to disk in its relay log, not that it has actually committed it to its data files. It guarantees that the data exists by some means somewhere, but not that it’s retrievable through a MySQL client.</p>

<p>Semi sync is available as a plugin, and if you compile from source, you’ll need to do <code>--with-plugins=semisync…</code>.
So far, the semisync plugin can only be built as a dynamic module, so you’ll need to install it once you’ve got your instance up and running. To do this, you do as with any other plugin:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>install plugin rpl_semi_sync_master soname 'semisync_master.so';
</span><span class='line'>install plugin rpl_semi_sync_slave soname 'semisync_slave.so';</span></code></pre></td></tr></table></div></figure>


<!--more-->


<p>If you get an 1126 error and a message saying “Can’t open shared library..”, you most likely need to set the plugin_dir variable in my.cnf and give MySQL a restart.
If you’re using a master/slave pair, you obviously won’t need to load both modules as above. You load the slave one on your slave, and the master one on your master. Once you’ve done this, you’ll have entries for these modules in the mysql.plugin table.
When you have confirmed that you do, you can safely add the pertinent variables to your my.cnf. The values I used (in addition to the normal replication settings) for my master/master sandboxes were:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>plugin_dir=/opt/mysql-5.5.6-rc/lib/mysql/plugin/
</span><span class='line'>rpl_semi_sync_master_enabled=1
</span><span class='line'>rpl_semi_sync_master_timeout=10000
</span><span class='line'>rpl_semi_sync_slave_enabled=1
</span><span class='line'>rpl_semi_sync_master_trace_level=64
</span><span class='line'>rpl_semi_sync_slave_trace_level=64
</span><span class='line'>rpl_semi_sync_master_wait_no_slave=1</span></code></pre></td></tr></table></div></figure>


<p>Note that you probably won’t want to use these values for _trace_level in production due to the verbosity in the log! I just enabled these while testing.
Also note that the timeout is in milliseconds.
You can also set these on the fly with SET GLOBAL (thanks Oracle!). Just make sure the slave is stopped before doing this, as semisync needs to be enabled during the handshake with the master in order to kick in.</p>

<p>The timeout is the amount of time the master will lock and wait for a slave to acknowledge the write before giving up on the whole idea of semi synchronous operation and continuing as normal.
If you want to monitor this, you can use the status variable Rpl_semi_sync_master_status which is set to Off when this happens.
If this condition should be avoided altogether, you would need to set a large enough value for the timeout and a low enough monitoring threshold as there doesn’t seem to be a way to force MySQL to wait forever for a slave to appear.</p>
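
<p>A monitoring check along these lines could be sketched in Python as follows. This is only an illustration: the function name and the sample input are made up, and it assumes you feed it tab-separated text of the kind the mysql client prints in batch mode for a SHOW GLOBAL STATUS query.</p>

<figure class='code'><div class="highlight"><pre><code class='python'>def semisync_is_active(show_status_output):
    """Return True if Rpl_semi_sync_master_status is reported as ON.

    Expects tab-separated "name, value" lines. Raises ValueError if the
    variable is missing, e.g. because the plugin is not loaded.
    """
    for line in show_status_output.splitlines():
        fields = line.split("\t")
        if len(fields) == 2 and fields[0] == "Rpl_semi_sync_master_status":
            return fields[1].strip().upper() == "ON"
    raise ValueError("Rpl_semi_sync_master_status not found in output")


# Made-up sample of what the status query might return after a timeout.
sample = "Variable_name\tValue\nRpl_semi_sync_master_status\tOFF\n"

if not semisync_is_active(sample):
    print("WARNING: master has fallen back to asynchronous replication")
</code></pre></div></figure>

<p>How you obtain the status text (the mysql client, a database driver, your monitoring agent) is up to you; the point is simply to alert as soon as the value is no longer ON.</p>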

<p>If you’re running an automated failover setup, you’ll want to set the timeout higher than your heartbeat, thus ensuring no committed data is lost. You might also want to set the timeout considerably lower initially on the passive master, so that you don’t end up waiting on the master you know is unhealthy and have just failed over from.</p>

<p>Before implementing this in production, I would strongly recommend running a few performance tests against your setup, as this will slow things down considerably for some workloads. Each transaction has to be written to the binlog, read over the wire and written to the relay log, and then lastly flushed to disk before each DML statement returns. You will almost definitely benefit from batching up queries into larger transactions rather than using the default auto commit mode, as this reduces how often these steps have to be performed.
Update: Even though the manual clearly states that the event has to be flushed to disk, this doesn’t actually appear to be the case (see comments). The above still stands, but the impact may not be as great as first thought.</p>

<p>When I find the time, I will run some benchmarks on this.</p>

<p>Lastly, please note that this is written while MySQL 5.5 is still in release candidate stage, so while unlikely, things are subject to change. So please be mindful of this in future comments.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[GlusterFS init script and Puppet]]></title>
    <link href="http://northernmost.org/blog//glusterfs-init-script-and-puppet/index.html"/>
    <updated>2010-08-09T08:08:14+01:00</updated>
    <id>http://northernmost.org/blog//glusterfs-init-script-and-puppet/glusterfs-init-script-and-puppet</id>
    <content type="html"><![CDATA[<p>The other day I had quite the head scratcher. I was setting up a new environment for a customer which included the usual suspects in a LAMP stack spread across a few virtual machines in an ESXi cluster.
As the project is quite volatile in terms of requirements, number of servers, server roles, location etc., I decided to start off using Puppet to make my life easier further down the road.</p>

<p>I got most of it set up, and got started on writing up the glusterfs Puppet module. Fairly straightforward: a few directories, configuration files and a mount point. Then I came to the Service declaration, and of course we want this to be running at all times, so I went on and wrote:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>service { "glusterfsd":
</span><span class='line'>    ensure =&gt; running,
</span><span class='line'>    enable =&gt; true,
</span><span class='line'>    hasrestart =&gt; true,
</span><span class='line'>    hasstatus =&gt; true,
</span><span class='line'>}
</span></code></pre></td></tr></table></div></figure>


<p>expecting glusterfsd to be running shortly after I purposefully stopped it. But it didn&rsquo;t. So I dove into puppet (Yay Ruby!) and deduced that the way it determines whether something is running or not is the return code of:
<code>/sbin/service servicename status</code></p>

<p>So a quick look in the init script which ships with glusterfs-server shows that it calls the stock init function &ldquo;status&rdquo; on glusterfsd, which is perfectly fine, but then it doesn&rsquo;t exit with the return code from this function, it simply runs out of scope and exits with the default value of 0.</p>

<p>So to get around this, I made a quick change to the init script so that it exits with the return code ($?) of the &ldquo;status&rdquo; function (defined in /etc/rc.d/init.d/functions on RHEL5), and Puppet had glusterfsd running within minutes.</p>
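
<p>The difference between the shipped script and the patched one can be sketched in Python. This is not Puppet&rsquo;s actual code, only a minimal illustration of a check that goes by the exit status; the two commands are hypothetical stand-ins for an init script&rsquo;s status action and assume a POSIX sh is available:</p>

<figure class='code'><div class="highlight"><pre><code class='python'>import subprocess


def is_running(status_command):
    """Treat exit status 0 as running and anything else as stopped."""
    return subprocess.call(status_command) == 0


# "false" plays the part of a status check that found the daemon stopped.
ignores_result = ["sh", "-c", "false; :"]           # runs off the end, exits 0
propagates_result = ["sh", "-c", "false; exit $?"]  # exits with the check's code

print(is_running(ignores_result))     # stopped daemon reported as running
print(is_running(propagates_result))  # stopped daemon reported as stopped
</code></pre></div></figure>

<p>With the first variant the caller never learns that the daemon is down, so it never has a reason to start it again.</p>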

<p>I couldn&rsquo;t find anything when searching for this, so I thought I&rsquo;d make a note of it here.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Legitimate emails being dropped by Spamassassin in RHEL5]]></title>
    <link href="http://northernmost.org/blog//legitimate-emails-being-dropped-by-spamassassin-in-rhel5/index.html"/>
    <updated>2010-05-26T19:05:34+01:00</updated>
    <id>http://northernmost.org/blog//legitimate-emails-being-dropped-by-spamassassin-in-rhel5/legitimate-emails-being-dropped-by-spamassassin-in-rhel5</id>
    <content type="html"><![CDATA[<p>Over the past few months, an increasing number of customers have complained that their otherwise OK spam filters have started dropping an inordinate number of legitimate emails.
The first reaction is of course to increase the score required to be filtered, but that just lets more spam through. I looked in the quarantine on one of these servers, and ran a few of the legitimate ones through spamassassin in debug mode. I noticed one particular rule which was prevalent in the vast majority of the emails. Here&rsquo;s an example:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>...
</span><span class='line'>[2162] dbg: learn: initializing learner
</span><span class='line'>[2162] dbg: check: is spam? score=4.004 required=6
</span><span class='line'>[2162] dbg: check: tests=FH_DATE_PAST_20XX,HTML_MESSAGE,SPF_HELO_PASS
</span><span class='line'>...</span></code></pre></td></tr></table></div></figure>


<p>4 is obviously quite a high score for an email whose only flaw is being in HTML. But FH_DATE_PAST_20XX caught my eye in all of the outputs. So to the rule files:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ grep FH_DATE_PAST_20XX /usr/share/spamassassin/72_active.cf
</span><span class='line'>##{ FH_DATE_PAST_20XX
</span><span class='line'>header   FH_DATE_PAST_20XX      Date =~ /20[1-9][0-9]/ [if-unset: 2006]
</span><span class='line'>describe FH_DATE_PAST_20XX      The date is grossly in the future.
</span><span class='line'>##} FH_DATE_PAST_20XX</span></code></pre></td></tr></table></div></figure>


<p>Aha. This is a problem. With 50_scores.cf containing this:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>$ grep FH_DATE_PAST /usr/share/spamassassin/50_scores.cf
</span><span class='line'>score FH_DATE_PAST_20XX 2.075 3.384 3.554 3.188 # n=2</span></code></pre></td></tr></table></div></figure>


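<p>there&rsquo;s no wonder emails are getting dropped! The pattern matches any year from 2010 through 2099, so every correctly dated email sent this year picks up between 2 and 3.5 points before anything else has been looked at. I guess this is a problem one can expect when running a distribution with packages 6 years old and neglecting to frequently (or at least every once in a while) <a href="http://wiki.apache.org/spamassassin/RuleUpdates">update the rules</a>!</p>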
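<p>To make the problem concrete, here is a small sketch which applies the same pattern to two made-up Date headers with plain grep. It only mimics the regex; it does not involve spamassassin itself, and the function name is mine.</p>

<pre><code>#!/bin/sh
# Does a header line trip the pattern used by FH_DATE_PAST_20XX?
trips_rule() {
    echo "$1" | grep -qE '20[1-9][0-9]'
}

# Two invented headers: one dated last year, one dated this year.
for hdr in 'Date: Tue, 26 May 2009 19:05:34 +0100' \
           'Date: Wed, 26 May 2010 19:05:34 +0100'; do
    if trips_rule "$hdr"; then
        echo "HIT:  $hdr"
    else
        echo "miss: $hdr"
    fi
done
</code></pre>

<p>The 2009 header is left alone and the 2010 one is hit. If updating the rules isn&rsquo;t an option right away, a stopgap which should work is to neutralise the rule with <code>score FH_DATE_PAST_20XX 0</code> in local.cf (normally under /etc/mail/spamassassin/ on RHEL, but check your own setup).</p>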

<p>Luckily, this rule is gone altogether from RHEL6&rsquo;s version of spamassassin.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Control-groups in rhel6]]></title>
    <link href="http://northernmost.org/blog//control-groups-in-rhel6/index.html"/>
    <updated>2010-05-13T09:26:51+01:00</updated>
    <id>http://northernmost.org/blog//control-groups-in-rhel6/control-groups-in-rhel6</id>
    <content type="html"><![CDATA[<p>One new feature that I’m very enthusiastic about in RHEL6 is Control Groups
(cgroups for short). It allows you to create groups and allocate resources to
them. You can then bunch your applications into groups to your heart’s
content.</p>

<p>It’s relatively simple to set up, and configuration can be done in two
different ways. You can use the supplied cgset command, or if you’re accustomed
to doing it the usual way when dealing with kernel settings, you can simply
echo values into the pseudo-files under the control group.</p>
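<p>As a rough sketch, the two approaches look like this. The group name and the /cgroup/gen mount point in the usage comments are the ones from the transcript further down; the function names are just something I picked for the example.</p>

<pre><code>#!/bin/sh
# 1) Using the supplied tool: cgset -r parameter=value group
limit_via_cgset() {
    cgset -r memory.limit_in_bytes="$2" "$1"
}

# 2) Writing the value into the pseudo-file under the group's directory
limit_via_echo() {
    echo "$2" &gt; "$1/memory.limit_in_bytes"
}

# Usage, as root, for a 512M limit:
#   limit_via_cgset group1 536870912
#   limit_via_echo /cgroup/gen/group1 536870912
</code></pre>

<p>Both end up at the same value in the kernel, so it&rsquo;s mostly a matter of taste.</p>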

<p>Here’s a control group in action:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>[root@rhel6beta cgtest]# grep $$ /cgroup/gen/group1/tasks
</span><span class='line'>1138
</span><span class='line'>[root@rhel6beta cgtest]# cat /cgroup/gen/group1/memory.limit_in_bytes
</span><span class='line'>536870912
</span><span class='line'>[root@rhel6beta cgtest]# gcc alloc.c -o alloc && ./alloc
</span><span class='line'>Allocating 642355200 bytes of RAM,,,
</span><span class='line'>Killed
</span><span class='line'>[root@rhel6beta cgtest]# echo `echo 1024*1024*1024| bc` &gt; \
</span><span class='line'>/cgroup/gen/group1/memory.limit_in_bytes
</span><span class='line'>[root@rhel6beta cgtest]# ./alloc
</span><span class='line'>Allocating 642355200 bytes of RAM,,,
</span><span class='line'>Successfully allocated 642355200 bytes of RAM, captn' Erik...
</span><span class='line'>[root@rhel6beta cgtest]#</span></code></pre></td></tr></table></div></figure>


<p>The first line shows that the shell which launches the app is under the control
of the cgroup group1, so all its child processes are subject to
the same restrictions.</p>

<p>As you can also see, the initial memory limit in the group is 512M. alloc is a
simple C app I wrote which calloc()s 612M of RAM (for demonstration purposes,
I’ve disabled swap on the system altogether). On the first run, the kernel
kills the process in the same way it would if the whole system had run out of
memory. The kernel message also indicates that it was the control group that ran
out of memory, and not the system as a whole:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>...
</span><span class='line'>May 13 17:56:20 rhel6beta kernel: Memory cgroup out of memory: kill process
</span><span class='line'>1710 (alloc) score 9861 or a child
</span><span class='line'>May 13 17:56:20 rhel6beta kernel: Killed process 1710 (alloc)</span></code></pre></td></tr></table></div></figure>


<p>Unfortunately it doesn’t indicate which cgroup the process belonged to. Maybe
it should?</p>

<p>cgroups don’t just give you the ability to limit the amount of RAM; there are a
lot of tunables. You can even set swappiness on a per-group basis! You can
limit the devices applications are allowed to access, you can freeze processes,
and you can tag outgoing network packets with a class ID in case you want to do
shaping or profiling on your network! Perfect if you want to prioritise SSH
traffic over anything else, so you can comfortably work even when your uplink
is saturated. Furthermore, you can easily get an overview of memory usage, CPU
accounting etc. of applications in any given group.</p>
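<p>A few of those knobs, sketched the echo-into-pseudo-file way. The helper names are made up for the example, and each function assumes that the directory it is given belongs to a hierarchy with the relevant controller (memory, freezer or net_cls) attached.</p>

<pre><code>#!/bin/sh
# $1 is always the cgroup's directory, e.g. /cgroup/gen/group1.

# Per-group swappiness, same 0-100 scale as vm.swappiness
set_swappiness() { echo "$2" &gt; "$1/memory.swappiness"; }

# Suspend and resume every task in the group
freeze() { echo FROZEN &gt; "$1/freezer.state"; }
thaw()   { echo THAWED &gt; "$1/freezer.state"; }

# Tag outgoing packets; 0x100001 is how net_cls spells tc class 10:1
tag_packets() { echo "$2" &gt; "$1/net_cls.classid"; }

# Usage, as root:
#   set_swappiness /cgroup/gen/group1 10
#   tag_packets /cgroup/gen/group1 0x100001
</code></pre>

<p>Note that the class ID on its own does nothing; you still need a matching tc class and a cgroup filter on the interface for any shaping to take place.</p>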

<p>All this means you can clearly separate resources and to quite a large extent
ensure that some applications won’t starve the whole system, or each other, of
resources. Very handy: no more waiting half an hour for the swap to fill up
and the OOM killer to kick in (and often choose the wrong PID) when a customer’s
applications have run astray.</p>

<p>A very welcome addition to RHEL!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[boot loader not installed in rhel6 beta]]></title>
    <link href="http://northernmost.org/blog//boot-loader-not-installed-in-rhel6-beta/index.html"/>
    <updated>2010-04-10T12:01:52+01:00</updated>
    <id>http://northernmost.org/blog//boot-loader-not-installed-in-rhel6-beta/boot-loader-not-installed-in-rhel6-beta</id>
    <content type="html"><![CDATA[<p>Just a heads up I thought I’d share in the hope that it’ll save someone some
time: when installing RHEL6 beta under Xen, be aware that pygrub currently
can’t handle /boot being on ext4 (which is the default). So in order to run
RHEL6 under Xen, make sure you modify the partition layout during the
installation process so that /boot ends up on a filesystem pygrub can read,
such as ext3.</p>

<p>This turned out to be a real head scratcher for me, and initially I thought the
problem was something else as Xen wasn’t being very helpful with error
messages.</p>

<p>Hopefully there’ll be an update for this soon!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[building hiphop-php gotcha]]></title>
    <link href="http://northernmost.org/blog//building-hiphop-php-gotcha/index.html"/>
    <updated>2010-02-21T11:17:51+00:00</updated>
    <id>http://northernmost.org/blog//building-hiphop-php-gotcha/building-hiphop-php-gotcha</id>
    <content type="html"><![CDATA[<p>Tonight I’ve delved into the world of Facebook’s
<a href="http://developers.facebook.com/news.php?story=358&amp;blog=1">HipHop</a> for PHP. Let me point out early
on that I’m not doing so because I believe that I will need it any
time soon, but I am convinced beyond a shadow of a doubt that I will be
approached by customers who think they do, and I’d rather not have opinions on,
or advise against, things I haven’t tried myself or at least have a very good
understanding of.</p>

<p>Unfortunately I set about this task on an RHEL 5.4 box, and it hasn’t been a
walk in the park. Quite a few dependencies were out of date or didn’t exist in
the repositories: libicu, boost, onig, tbb etc.</p>

<p>CMake did a good job of telling me what was wrong though, so it wasn’t a huge
deal; I just compiled the missing pieces from source and put them in
<code>$CMAKE_PREFIX_PATH</code>. One thing CMake didn’t pick up on, however, was that the
flex version shipped with current RHEL is rather outdated. Once I thought I had
everything configured, I set about the compilation, and my joy was swiftly
cut short by this:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>[  3%] [FLEX][XHPScanner] Building scanner with flex /usr/bin/flex version
</span><span class='line'>2.5.4
</span><span class='line'>/usr/bin/flex: unknown flag '-'.  For usage, try /usr/bin/flex --help</span></code></pre></td></tr></table></div></figure>


<p>Not entirely sure what it was actually doing here, I took the shortcut of
replacing /usr/bin/flex with a shell script which just exited after putting $@
in a file in /tmp/, and re-ran <code>make</code>. Looking in the resulting file, these are
the arguments flex is given:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>-C --header-file=scanner.lex.hpp
</span><span class='line'>-o/home/erik/dev/hiphop-php/src/third_party/xhp/xhp/scanner.lex.cpp
</span><span class='line'>/home/erik/dev/hiphop-php/src/third_party/xhp/xhp/scanner.l</span></code></pre></td></tr></table></div></figure>


<p>To me that looks quite valid, and there’s certainly no lone <code>-</code> in that command
line.</p>
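<p>For the curious, the throwaway script was along these lines. This is a reconstruction from memory rather than the exact file; the helper name, the log file name and the exit code of 1 are all choices made for this sketch.</p>

<pre><code>#!/bin/sh
# Write a fake "flex" to the path given as $1. When invoked, the fake
# appends its arguments, one per line, to the file given as $2 and
# then bails out so that make stops right there.
make_fake_flex() {
    target=$1
    log=$2
    cat &gt; "$target" &lt;&lt;EOF
#!/bin/sh
printf '%s\n' "\$@" &gt;&gt; "$log"
exit 1
EOF
    chmod +x "$target"
}

# Usage, as root, after moving the real binary out of the way:
#   mv /usr/bin/flex /usr/bin/flex.orig
#   make_fake_flex /usr/bin/flex /tmp/flex-args.txt
</code></pre>

<p>Crude, but it beats digging through generated makefiles to find out how a tool is really being called. Just remember to put the real binary back afterwards.</p>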

<p>Long story short, flex introduced <code>--header-file</code> in a relatively “recent” version
(2.5.33 it seems, but I may be wrong on that one, doesn’t matter). Unlike most
other programs (using getopt), the old version won’t tell you <code>Invalid option
'--header-file'</code>; it just reports <code>unknown flag '-'</code>, as seen above. So after
compiling a newer version of flex, I was sailing again.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Development; just as important as dual nics]]></title>
    <link href="http://northernmost.org/blog//development-just-as-important-as-dual-nics/index.html"/>
    <updated>2010-02-13T17:38:56+00:00</updated>
    <id>http://northernmost.org/blog//development-just-as-important-as-dual-nics/development-just-as-important-as-dual-nics</id>
    <content type="html"><![CDATA[<p>There is a popular saying which I find you can apply to most things in life:
“You get what you pay for”. Sadly, this does not seem to apply to software
development in any way. Those of you who know me know that I work for a reasonably sized
hosting company in the upper market segment. We have thousands of servers and
hundreds of customers, so after a while you get quite a decent overview of how
things work and a vast arsenal of “stories from the trenches”.</p>

<p>So here’s a small tip; ensure that your developers know what they are doing! It
will save you a lot of hassle and money in the long run.</p>

<p>Without having made a science out of it, I can confidently say that at the very
least 95% of the downtime I see on a daily basis is due to faulty code in the
applications running on our servers.</p>

<p>So after you’ve demanded dual power feeds to your rack, bonded NICs and a
gazillion physical paths to your dual controller SAN, it would make sense to
apply the same attitude towards your developers. After all, they are carbon-based
humans and are far more likely to break than your silicon NIC. Now
unfortunately it is not as simple as “if I pay someone a lot of money and let
them do their thing, I will get good solid code out of it”, so a great deal of
due diligence is required in this part of your environment as well. I have seen
more plain stupid things coming from people on 50k p.a. than I care to mention, and
I have seen plain brilliant things coming out of college kids’ basements.</p>

<p>This is important not only from an availability point of view; it’s also about
running cost. The amount of hardware in our data centers which is completely
redundant, and could easily be made obsolete with a bit of code and database
tweaking, is frightening. So you think you may have struck a great deal when
someone said they could build your e-commerce system in 3 months for 10k less
than other people have quoted you. But in actual fact, all you have done is get
someone to effectively re-brand a bloated, way too generic, stock
framework/product which the developer has very little insight into and control
over. Yes, it works: if you “click here, there and then that button”, the right
thing does appear on the screen. But only after executing hundreds of SQL
queries, looking for your session in three different places, doing four HTTP
redirects, reading five config files and including 45 other source files. Needless
to say, that one-off 10k you think you have saved will be swallowed by
recurring hardware cost in no time. You have probably also severely limited
your ability to scale things up in the future.</p>

<p>So in summary, don’t cheap out on your development, but at the same time don’t
think that throwing money at people will make them write good code. Ask someone
else to look things over every now and then, even if it will cost you a little
bit. Use the budget you were planning on spending on the SEO consultant.
Let it take time.</p>
]]></content>
  </entry>
  
</feed>
