Luis Villa: The highlight of this release is new patent language, modeled on Apache’s. We believe that this language should give better protection to MPL-using communities, make it possible for MPL-licensed projects to use Apache code, and be simpler to understand.
I love that the Apache License, Version 2.0 is becoming effectively the preferred universal donor license for communities which care about patent protection.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Example" type="EXAMPLE_LIST" />
  <xs:simpleType name="EXAMPLE_LIST">
    <xs:list>
      <xs:simpleType>
        <xs:restriction base="xs:integer">
          <xs:assertion test="$value mod 2 = 0" />
        </xs:restriction>
      </xs:simpleType>
    </xs:list>
  </xs:simpleType>
</xs:schema>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Example">
    <xs:simpleType>
      <xs:union memberTypes="MYDATE xs:integer" />
    </xs:simpleType>
  </xs:element>
  <xs:simpleType name="MYDATE">
    <xs:restriction base="xs:date">
      <xs:assertion test="$value lt current-date()" />
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
Since Apollo’s initial setup, I’ve made a few preference changes and additions including WindowShade X and a mouse.

For a long time, WindowShade was one of my favorite utilities. But somewhere along the way WindowShade fell off the OS update train with me. What made me look it up again? Minimization.
I’m not so fond of Snow Leopard’s Exposé grid. I’ve gotten used to it. But Spaces and minimized windows have never played well together. Then I thought of WindowShade. Its Minimize-in-Place feature is much cooler than the lame old dock.
If I’m going to play StarCraft, I might as well do it right. The M500 is serviceable. Not too fancy. I set the two thumb buttons to Select Previous Tab (⇧⌘[) and Select Next Tab (⇧⌘]); though for everyday usage, I’m happy with the trackpad. Especially with the new three-finger dragging gesture.
Using Subversion again. I’m surprised I didn’t have something like this defined previously:
# revert recursively from the current directory
svnrevert() {
    svn revert --depth infinity .
}
Vertical window buttons? I don’t think so:
> defaults write com.apple.iTunes full-window -1
I have to send my MacBook Pro to Apple for service again, so it's time to review my list of Sensitive Data: Things to Delete and other preparation for giving up physical control of a Mac. Unfortunately last month my MacBook Pro completely died, and I didn't have a chance to do any of this. The Genius asked for my password, and I just laughed at her. She explained they'd probably replace the hard drive with a new install if they couldn't get in, and I said I'd deal with that, but suggested they just use the installer to reset the password to something they liked. As it turned out, they apparently decided not to bother -- I got the MBP back with some security settings changed, so perhaps Apple techs have a different tool that grants them access.
- apple user, and make it an administrator. Give it a simple password (don't forget to write it on a note for the tech -- you don't want to wait a couple extra days while they ask for the password!).
- apple account.
- root if relevant):
  - ~/Library/Keychains/
  - ~/.ssh/ (except authorized_keys)
  - ~/Library/Mail/; I don't do this -- I have a lot of mail, and it's not generally sensitive)
- sudo passwd root).
- If the motherboard has changed, the serial number & MACs will change.
According to “The Big Lie: Spying, Scandal and Ethical Collapse at Hewlett-Packard,” an authoritative account by the former BusinessWeek writer Anthony Bianco, Mr. Hurd was very involved in H.P.’s efforts to hunt down the leakers. After the scandal broke, he hijacked H.P.’s internal investigation, hiring an outside law firm and ordering it to report directly to him, instead of the board, which is the normal practice.
we dropped by a small calculator and electronics store the other day to buy a programmable calculator. The salesman's product knowledge, enthusiasm, and interest in us were striking and naturally we were inquisitive. As it happened, he was not a store employee at all, but a twenty-eight-year-old Hewlett-Packard (HP) development engineer getting some first-hand experience in the users' response to the HP product line. We had heard that a typical assignment for a new MBA or electrical engineer was to get involved in a job that included the practical aspects of product introduction. Damn! Here was an HP engineer behaving as enthusiastically as any salesman you'd ever want to see.
After having shared my thoughts on how to improve focus and how to track tasks eating up time, this post will explain how to keep time invested at a more or less constant level. The goal of this exercise is to keep obligations at a reasonable level - be it at work or during one’s spare time.
Recently I have collected a small set of techniques to reduce what gets to my desk - I don’t claim this list to be exhaustive. However some of it did help me organise a conference and still have a life besides that.
Sharing and delegating are actually two different ways of integrating other people: Sharing for me means working together on a topic. That could be as easy as setting up a CMS or it could be more involved as in publishing articles on Lucene in some magazine. The advantage is that both of you can contribute to the task, possibly even learn from each other: When I was doing the article series on Lucene together with Uwe it also was a great learning experience for me to have someone take the time to explain to me - well, not only to me - what flexible indexing, local search and numeric range queries are really all about, as in technically implemented. So it was not only an enormous time-saver for me, as the alternative would have been me reading through documentation, code and mailing lists to get up to date. But it also gave me the unique opportunity to learn from the very developers of these features about how they work and how they are meant to be used.
The disadvantage of sharing is that part of the work still remains on your desk. That’s where delegation helps: Take the task, find someone who is capable and willing to solve it and give it to them. There are two problems here: First you have to trust people to actually work on the task. Second you probably cannot avoid checking back from time to time to see if there is progress, if there are any impediments etc. So it means less work than with sharing. But there is more risk in not getting your results and more work to be done for co-ordination. However it is a very powerful technique if applied correctly to scale what can be achieved: Telling people what you need help with and letting them take over some of that work does scale way better than micro-managing people or even trying to be part of every piece of a project. It means giving up some of your control, in return you can turn to other potentially more involved tasks. Note to self: Need to build up more trust in that area.
Both concepts, however, are not actually about saying no but about being able to say yes even if you already have very little time left.
Prioritising tasks can be done on a scale from zero to any arbitrarily large number. Obviously it helps with deciding whom to say no to: It’s going to be those projects rated very low, that is, those you could easily do without. That’s the simplest case as it is easiest to explain. The strategy I usually use is to be honest with people: If there are conflicting conferences, it’s easy to reject invitations. If some publication does not pay for you, it’s easiest to be open and honest with people and tell them. Usually they will understand.
A second reason for a rating of zero is that the task is one of those “Does not belong on my desk” tasks. My advice for those would be to get rid of them as quickly as possible: They draw away your energy without giving back any value. This issue plays nicely with the “patches welcome” theme from open source: People working on open source projects are most successful if they are driven by their own needs. So if you want something implemented, either implement and submit it yourself - or find someone you can pay to do so. People will not work for you. You can jump up and down, complain on the mailing lists - but if the feature you would like to see is something that no-one else in the existing community needs, it won’t get done until someone needs it.
A nice way of rejecting favours that works at least sometimes is to raise the barrier. The example here would be getting an invitation to give an introductory talk for a closed audience. So what I tried was to raise the bar by asking for funding for travel and accommodation.
Keep in mind though that there is the risk that the one inviting you actually accepts your conditions - no matter how high you think you have set them. Especially the example given above has the problem of being too low a bar in most cases. So be prepared to have to keep your promise. As a result the conditions you set really should lead to the task turning into something that is fun to do.
Imagine you have committed to some task. Later on you realise you won’t actually make it: You have no time, priorities have changed, the task is too involved or any other reason you could potentially imagine.
The important way to reduce the load on your desk is to communicate this issue as early as possible. It’s clear that people will be more disappointed the later they learn that something they probably depend on won’t arrive in time or will never happen: They’ll never be extremely happy, however the sooner they learn the more time they have on their part to react. And actually, most people don’t react that disappointed at all, simply because they may have counted some risk into the equation when giving you the task - which is not to say you should lower the reliability of your commitments, simply because no-one is expecting you to meet your goals anyway. However usually the amount of trouble expected is way higher than what actually happens. Second note to self: Don’t forget about this option.
At least in open source: If it’s nothing that helps make your world better - there are other people out there to help out. Patches being welcome may seem obvious. However in some areas it really is not: If someone asks a project member to be present at some conference, he may not consider himself capable of representing the project or even just making an impact by talking to people about it. That is the point at which to encourage people that any input is welcome - not only code, but also documentation, communication and marketing work.
Of course as with any Pattern there are boundaries when not to apply it or when applying it would mean too much effort or loss. If that is the case and you have committed and cannot step back, then you should think about what could be a great reward if you went through the tasks: What would it take to make you happily comply and still gain energy through what you are doing? Basically it isn’t about doing what you like but about loving what you do (L. Tolstoi).
There is also valuable advice on managing one’s energy from the Apache Software Foundation that is specifically targeted at new committers. If you have not done so yet, take the time to read it.
Game On
: exhibition billing itself as “the world’s biggest celebration of games”, arrives in Dublin on Sep 20 at the Ambassador, on tour from its home in The Barbican Art Gallery in London. ‘Enjoy a totally interactive experience with rare memorabilia and play your way through over 100 playable games from the arcade classics to the latest releases.’ tix are EUR10
(tags: games gaming exhibitions dublin)
Your Country, Your Call, You’re Doomed
: Bock on the predictably-crap biz-waffle results from the YCYC “get Ireland back on track” competition. ‘If we don’t take this seriously, we’re doomed to repeat the current economic disaster over and over again, each generation with its own Bertie Ahern, its own Seanie Fitzpatrick, its own Fingers Fingleton, and all the other assorted, integrity-free panhandlers and parasites who have soiled the reputation of this country and sold us down the Swanee for their own, ignorant, self-serving enrichment. Forget about Eamon Ryan’s smart economy. Let’s put all our effort into creating the Honest Economy.’
(tags: ycyc waffle business ireland vision-lock integrity bock-the-robber economy)
Pollsters called me this morning. Reputable ones (for what that’s worth): Ipsos Mori. Just in case it’s feeding into anything that matters, I agreed to answer their questions.
She started by asking about walking: have I walked anywhere for at least five minutes in the past four weeks? Good grief, how can you possibly not do that, unless you’re stuck in a wheelchair! Thirty minutes? Yes, of course. How often? Every day! For the first time, a question where a different answer is at least thinkable.
Then she moved on to cycling (yes I do, though not every day). Leisure or utility? Well, tends to be leisure these days, since I work from home and shop on foot. Any other sporting activities? Yes, swimming. Where? Our local rivers.
Do I take part in any organised events – no. Have I had or given any tuition – no. Am I satisfied with facilities in my local area? Hmmm, how can I not be satisfied when I have no expectations of them? I choose to live in an area with open moorland, big hills and nice rivers precisely because it has those things! Yeah OK I’m satisfied (with reservations about how that answer might be spun if Vested Interests are involved).
Further questions about cultural interests: have I visited a museum, gallery or exhibition in the past year (well, er, yes, a year is a long time to go without). A library? Probably not: I use the ‘net these days. Theatre, Concerts, performance events? Yes, well, I sing every week, and enjoy other people’s performances.
Finally some demographic questions about me, and she revealed this was a survey commissioned by Sport England. OK, there’s the vested interest: someone wants to justify their own existence and jobs. Dammit, no matter what the results, they’ll spin it: “all those people love sports, give us lots of money”, or “we need lots of money to get all those couch potatoes up and doing something”. Hmmm …
A quick google reveals the survey is here, and is a big, multi-year event. This Sport England isn’t someone reacting to the prospect of austerity: they’ve been engaged in self-justification since at least 2006, as revealed in their pages about it.
This quango sounds like a very good target for a 100% cut.
$ sudo vi /lib/udev/rules.d/50-udev-default.rules
# libusb device nodes
SUBSYSTEM=="usb", ENV{DEVTYPE}=="usb_device", MODE="0666"
I spent almost all day yesterday and today working on painting my son's room. You can see the fruits of my labor here.
I'm a little disappointed with the Apache. I think by that time I was just so tired that I didn't have the same attention to detail. I'm very pleased with the mustang and the piper, and the airliner is pretty awesome, too.
Hit The Road: Public Transport Directions for Dublin
: ‘a public-transport route-planning service for Dublin city, which shows you how to get from A to B using buses, Luas or DART services. The original version was built during the first Startup Weekend Dublin in May 2010.’ Pretty good; although in my tests it wasn’t able to find the optimal route, it always came up with something that made a good starting point
(tags: routes dublin public-transport buses hit-the-road)
o2 Broadband Dongle Working on Ubuntu 10.04
: the Huawei E1752 dongle does that horrible thing where it defaults to acting as a USB storage device containing the Windows drivers, instead of acting as a 3G modem by default. usb_modeswitch should be in the Ubuntu base install to deal with this crap
(tags: usb broadband o2 hauwei e1752 ubuntu dongle)
Strategically and operationally, I think there is a huge difference between drivers and passengers that comes out when they are placed in a new situation. When placed in their next company:
- Drivers assess the situation and develop strategies and tactics appropriate for the new reality.
- Passengers do what worked last time.
I am repeatedly stunned by the number of otherwise very intelligent people who show up and do what worked last time. Often with the very same cohort / entourage with whom they did it.
Different terms for different investors is clearly the way of the future. Markets always evolve toward higher resolution.
We’ve often claimed that opening up development of a project can help in its long term sustainability. By allowing new funders and participants to take an active role, even leadership, in a project it is possible to survive the natural coming and going of project participants.
Today I added the following update to the OSS Watch sustainability case study on Apache Cocoon:
Activity on the project has slowed considerably since its heyday. However, development continues despite the departure of a significant number of community leaders. It can therefore be argued that Cocoon validates the community model of software development as described in this document.
Steve Lee, our accessibility expert, has been working with a team at the University of Southampton to open up a cross-browser ToolBar designed to help make the web more accessible. It’s a great project that allows users to control the way a page is displayed, invoke a text-to-speech reader, spell check editable content, look up dictionary definitions and extract reference information (amongst other things). Although the tool is an accessibility tool, many of its features are of much more general use; Lifehacker said the work brought “something long overdue for web users.”
Steve helped the team open source the project and tried to work with TechDis to explain the benefits of collaborative development, in particular the ability to spread the cost (and risk) of development across multiple partners. Steve spoke about this with the H Online at our TransferSummit back in June:
Lee told The H that the tool, developed as an open collaboration between JISC TechDis and University of Southampton’s School of Electronics and Computer Science, was created to replace a previous toolbar … Lee said the open development process … has allowed the project to be more sustainable.
With the support of both TechDis and Southampton the ToolBar has been getting plenty of attention and use. Nobody can call it perfect but it is certainly useful. Furthermore, since it is open source others can help improve it.
Despite the success of the ToolBar in terms of raw use figures, Sal Cooke, Director of TechDis, recently announced the demise of the ToolBar. She said that TechDis were “delighted by the response and the positive feedback we’ve had from users” and that the “number of downloads has surpassed all expectations.” So why kill the project?
Sal goes on to say “many of you will be aware that we [TechDis] have undertaken a major overhaul of our own website, with a commitment to embedding within it, a set of new accessibility tools.” Here Sal appears to be saying that TechDis no longer has a need for the ToolBar in addressing the accessibility needs of their own site users. Sal goes on to say, “in view of the above and the current economic climate, we have taken the decision to discontinue further development of the JISC TechDis Toolbar in favour of channelling resources into areas where we can make the most impact.”
On the surface this looks just fine: TechDis have not invested beyond the initial pilot funding, and if TechDis have an alternative solution available to them, then why should they pay again to support the ToolBar?
However, for me this misses one of the most important advantages of this work. As an open source project it is not just useful for TechDis, it is useful for every web user and every website developer.
So what about the rest of us? How can we address the accessibility needs that the toolbar tackled?
Fortunately for us, the ToolBar has never been an in-house TechDis development, despite what TechDis may think. It is an open source development managed by the University of Southampton. Dr. Mike Wald followed Sal’s mail saying:
Although the toolbar was initially funded by Techdis and we provided a ‘Techdis badged version’ for them, the toolbar is an Open Source Project and my team at Southampton University are continuing to develop it …
The point here is that whilst TechDis (rightly) considered the TechDis branded version of the software as their own, the project is an open source one and can therefore be modified and distributed by anyone. To date all of the “quarter of a million uses of the toolbar” have carried the TechDis logo in recognition of their support of the project, but the future of the project is not dependent on TechDis.
OSS Watch are working with the Southampton team on a number of initiatives and we are pleased to report that we have been asked to help ensure the ToolBar continues to survive. I’m certain that it will and can only repeat Steve’s words from June:
they [Southampton ECS] are undoubtedly a group to watch as they steadily increase their portfolio of widely applicable open accessibility projects
Take a look at this great accessibility project and help the Southampton team by reporting any bugs you find, suggesting new features or even contributing code. The ToolBar will live on under a different name (to be decided).
One thing that many people don’t realise is that the distinction between paragraphs and line breaks isn’t unique to HTML. In fact, it’s a distinction that people have been working with quite happily for a very long time – Microsoft Word has supported the two concepts right from its very first version. With the more recent versions of Word, the default paragraph spacing is zero, so line breaks and paragraphs look the same, even though there is an important semantic difference. You can still see the difference if you show the non-printing characters:
The first line is a paragraph, note that it ends with the invisible paragraph marker character (¶). The second line ends with a line break character instead (¬) so the end of the line is not the end of the paragraph. There are actually only two paragraphs in this document. You can see this in action when we position the caret at the end of the text and apply a heading style.
The heading style is a paragraph level style, so it applies to the entire paragraph, even though there’s a line break. In fact, if you save that original document as HTML, you get the structure:
<p>Paragraph 1</p>
<p>First line of paragraph 2<br>
Second line of paragraph 2</p>
Just like any good HTML editor would create. You can try this yourself in Word: typing Enter creates a new paragraph, while typing Shift-Enter inserts a line break instead. The same keystrokes apply to most HTML editors as well.
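To make the distinction concrete, here is a minimal JavaScript sketch of that mapping. It is not how Word or any particular editor does it: `textToHtml` and `escapeHtml` are hypothetical helpers, and the sketch assumes a blank line stands for a paragraph break (Enter) while a single newline stands for a line break inside the paragraph (Shift-Enter).

```javascript
// Hypothetical helpers, for illustration only.
// Assumption: a blank line ends a paragraph (Enter); a single newline
// is a line break inside the current paragraph (Shift-Enter).
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

function textToHtml(text) {
  return text
    .split(/\n{2,}/) // blank lines separate paragraphs
    .map(function (para) { return para.trim(); })
    .filter(function (para) { return para.length > 0; })
    .map(function (para) {
      // Remaining single newlines become <br> inside the same <p>.
      var lines = para.split('\n').map(escapeHtml);
      return '<p>' + lines.join('<br>\n') + '</p>';
    })
    .join('\n');
}

var html = textToHtml(
  'Paragraph 1\n\nFirst line of paragraph 2\nSecond line of paragraph 2'
);
console.log(html);
```

The result has two `<p>` elements, with the `<br>` staying inside the second one, which matches the structure Word saves.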
A few days ago I talked about the Email and P Myth, but didn’t explain why it’s so frustrating for editor developers that people keep wanting to use BR tags instead of P tags. It’s not actually because we’re fastidious about using the correct semantic HTML, though obviously we do recommend that; it’s because a lot of concepts that users expect simply aren’t possible to implement sensibly if you don’t use the correct paragraph level elements.
Take, for example, a simple function like alignment. In Word and any other sensible editor, when you click the right align button, the current paragraph becomes aligned right. Which pretty clearly highlights the problem – if you don’t have paragraphs, what should become aligned? Typically what happens is the entire document, being one paragraph, gets aligned right and users unsurprisingly complain. So why not just change the alignment of the current line? Think about it in HTML terms, if you have:
Line 1<br> Line 2
Where do you apply the text-align style exactly? You can’t apply text-align to inline elements so:
<span style="text-align: right;">Line 1</span><br> Line 2
doesn’t work. Further, we’ve been told that inserting P tags isn’t allowed, so we can’t just make it:
<p style="text-align: right;">Line 1</p> Line 2
because that would introduce “extra whitespace”1. So the only option is to insert a DIV tag which makes the HTML structure quite complex. DIVs aren’t just a meaningless element, they add structure to the document and the editor can’t tell which DIVs are important structure and which are just pretending to be paragraphs2.
This problem crops up in a huge number of different places – applying headings, style classes, lists, indenting and more. All these special cases need to be handled, adding to the download size for the editor and reducing performance – not to mention taking up a huge amount of development time that could have been put into something more useful.
WYSIWYG editors do such a good job of hiding the way that HTML and CSS work, that people often forget that in the end we’re still trying to generate high quality, standards-compliant HTML output. We’re limited by what those standards and browsers can actually do, so removing important concepts like paragraphs makes it incredibly difficult to create an editor that works intuitively.
Ultimately, it’s not that we developers don’t want to support your use case, it’s just that the restrictions you’re asking us to abide by dramatically reduce our ability to deliver a high quality authoring experience.
1 – which can and should be removed using CSS in the first place
2 – remembering that what was once an unimportant DIV may become important at some point because of a change to scripts or CSS that the editor isn’t aware of
Back when git and GitHub were relatively new to the mainstream, there was a big discussion about how it promoted forking and was potentially bad for community building. Since then GitHub has well and truly proven that it can successfully support significant community and very successful projects. Looking around GitHub though, it’s clear that not every project successfully builds community there and the tendency to fork can still become a problem.
The big difference that I see between projects thriving with GitHub and those where forking looks like more of a problem is that thriving projects have a “campfire” other than the source code for people to gather around. Often that’s a mailing list, other times it’s IRC and various other technologies but there’s always one common place where the community can build. GitHub by itself is unlikely to build up such a community because there isn’t a really good way to engage everyone in conversation. As such, you get a lot of forks and people making improvements, but not really working together and often the various forks never get folded back into the original.
The good news is that despite all the forks often remaining off on their own, it doesn’t seem to be detrimental to the original project. People find the main GitHub repository through other means, mostly links on the web, so there doesn’t seem to be confusion over which fork is the “main” one. On the other hand, most of those forks represent a wasted effort which could have pushed the project forward if only there had been some communication.
Finally, it’s interesting that the GitHub model of anyone creating a fork is far more successful for projects which don’t need a contributor agreement to be signed. In that case, any fork that looks promising can be pulled in without needing extra permissions, so even without the extra communication channel the effort can be utilised. With a contributor agreement, you have to reach out and get the author to sign, which is a significant barrier.
Let’s say you are writing your new awesome web application in Node.js, because you know, Node.js is the new hotness and awesome.
Let’s also say your new Node.js web application does non-trivial things, and hits a limited backend resource. You can’t rewrite this backend system in the new hotness of async Node.js yet, so it can only handle 10 concurrent clients. This should be a very common situation unless you happen to be at a new startup and are green fielding your entire application stack. If that’s your case, lucky you! But for everyone else, you need to control the amount of concurrency that Node.js applies to your backend.
I am going to examine the simplest case of an HTTP server, which hits a backend resource, transforms it, and returns it to the client. Oh yeah, I mean a reverse proxy server. A proxy server isn’t all that different from most application servers, taking client input and hitting different backends. A proxy server just happens to hit HTTP most of the time, while an application server hits a database or another resource, but let’s not get too deep into that. Node.js just happens to make writing proxy servers very easy, and relatively short, so I like using it as an example.
All of the source code for the examples can be found inside my node-examples repository on github.
Here is our simple application server in Node.js, it takes in a client request, and proxies it to http://nodejs.org/, server_nolimit.js:
var http = require('http');
var sys = require('sys');

var destination = "nodejs.org";

http.createServer(function(req, res) {
  // One outgoing client per incoming request: no limit on concurrency.
  var proxy = http.createClient(80, destination);
  var preq = proxy.request(req.method, req.url, req.headers);

  console.log(req.connection.remoteAddress +" "+ req.method +" "+req.url);

  preq.on('response', function(pres) {
    // Relay the backend's status, headers and body to the client.
    res.writeHead(pres.statusCode, pres.headers);
    sys.pump(pres, res);
    pres.on('end', function() {
      preq.end();
      res.end();
    });
  });

  // Forward the client's request body to the backend as it arrives.
  req.on('data', function(chunk) {
    preq.write(chunk, 'binary');
  });
  req.on('end', function() {
    preq.end();
  });
}).listen(8080);
Note that this is in no way a valid HTTP 1.1 proxy server; it breaks all kinds of things in the RFC, but it’s just bad enough for demos to work. It takes a client request in, creates an outgoing HTTP client request, and uses sys.pump to transfer the data.
It is functional, and if you threw 1000 concurrent clients at it, it would want to open 1000 connections to nodejs.org. Poor nodejs.org!
This might appear to be the simplest approach: keep a count of the active clients and, if we are over the limit, start queuing any new clients.
The problem comes in with how Node.js behaves: you will also need to buffer any incoming data while the request is queued. You could try to call pause() on the stream, but this is only advisory, so streams can still trickle some data that you need to buffer. This means our simple code of adding a counter becomes complicated by extra buffering and workarounds for how Node’s HTTP streams work.
server_limit_clients.js implements this, it accepts the client request, and keeps a currentClients variable up to date with the number of active outgoing requests. When one request finishes, it starts processing the next one:
var http = require('http');
var sys = require('sys');

var destination = "nodejs.org";
var maxClients = 1;
var currentClients = 0;
var _pending = [];

// Start the next queued request, if there is one.
function process_pending()
{
  if (_pending.length > 0) {
    var cb = _pending.shift();
    currentClients++;
    cb(function() {
      currentClients--;
      process.nextTick(process_pending);
    });
  }
}

// Run cb now if we are under the limit, otherwise queue it.
// cb receives a "done" function to call when its backend request finishes.
function client_limit(cb, req, res)
{
  if (currentClients < maxClients) {
    currentClients++;
    cb(function() {
      currentClients--;
      process.nextTick(process_pending);
    }, req, res);
  }
  else {
    console.log('Overloaded, queuing clients...');
    _pending.push(cb);
  }
}

http.createServer(function(req, res) {
  // Request body chunks are buffered here while the request waits.
  var bufs = [];
  var done_buffering = false;

  client_limit(function(done){
    var proxy = http.createClient(80, destination);
    var preq = proxy.request(req.method, req.url, req.headers);

    console.log(req.connection.remoteAddress +" "+ req.method +" "+req.url);

    preq.on('response', function(pres) {
      res.writeHead(pres.statusCode, pres.headers);
      sys.pump(pres, res);
      pres.on('end', function() {
        preq.end();
        res.end();
        done();
      });
    });

    // Replay the buffered body to the backend, then finish the request.
    function finishreq() {
      bufs.forEach(function(buf){
        preq.write(buf);
      });
      preq.end();
    }

    if (done_buffering) {
      finishreq();
    }
    else {
      req.on('end', function() {
        finishreq();
      });
    }
  });

  // Copy each chunk, since we hold on to it until the request is sent.
  req.on('data', function(chunk) {
    var tbuf = new Buffer(chunk.length);
    chunk.copy(tbuf, 0, 0);
    bufs.push(tbuf);
  });

  req.on('end', function() {
    done_buffering = true;
  });
}).listen(8080);
I have to say, ew! It just got too complicated. This buffering yourself stuff sucks. In addition, if you had 1000 clients connect, most clients would just see a wonderful spinning spinner on their browser, waiting for their turn to get an outgoing client.
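For what it's worth, the queueing part is easier to reason about once it is pulled away from the HTTP plumbing. Here is a minimal sketch of just the counter and the pending queue. createLimiter and its run, active and queued members are names made up for illustration; they are not part of Node.js or of server_limit_clients.js. Unlike the code above, the sketch starts the next queued task synchronously rather than through process.nextTick, and it does none of the request buffering:

```javascript
// Hypothetical helper: run at most maxActive tasks at once, queue the rest.
function createLimiter(maxActive) {
  let active = 0;
  const pending = [];

  function start(task) {
    active++;
    let finished = false;
    // The task is handed a done() callback and must call it when it completes.
    task(function done() {
      if (finished) return; // ignore accidental double calls
      finished = true;
      active--;
      if (pending.length > 0) {
        start(pending.shift());
      }
    });
  }

  return {
    run(task) {
      if (active < maxActive) start(task);
      else pending.push(task); // over the limit: wait for a free slot
    },
    get active() { return active; },
    get queued() { return pending.length; },
  };
}

// Usage: five tasks, two slots. Each task just records its done() callback
// so that we can finish it by hand.
const limiter = createLimiter(2);
const finishers = [];
for (let i = 0; i < 5; i++) {
  limiter.run((done) => finishers.push(done));
}
```

At this point two tasks are running and three are queued; calling finishers[0]() frees a slot and immediately starts the next queued task.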
If your Node.js application is deployed behind a load balancer of some kind, it might be a better idea to apply back pressure to the load balancer, so that it sends your application less traffic. A simple way to achieve this is to stop listening for connections.
When a socket in TCP is set to listen for incoming connections, the kernel keeps a backlog of pending connections, so while this method isn't perfect, under high load it will quickly push back on your load balancer to stop sending your application traffic. One problem is that Node.js currently hard-codes 128 connections in the TCP listener backlog, so if your desired concurrency level is very low, this method will not be very effective in applying back pressure.
In addition, because of how Node.js' IOWatchers work, even if you tell it to stop listening, it will continue processing any sockets it has already called accept on. This makes the method very crude, and relatively inaccurate at keeping exactly X clients making backend requests.
server_limit_incoming.js implements this. It reads new client requests until there are too many in flight, then calls stop(), an internal function on the server instance's watcher object. This stops libev from listening for new clients on that socket by removing it from the event loop. Once we are back below the maximum number of clients, it calls start() on the IOWatcher, which adds the listening socket back to the event loop:
var http = require('http');
var sys = require('sys');
var destination = "nodejs.org";
var port = 8080;
var maxClients = 2;
var currentClients = 0;
var active = true;
var hs = null;
function activate() {
  if (!active && currentClients < maxClients) {
    hs.watcher.start();
    active = true;
  }
}
hs = http.createServer(function(req, res) {
  var proxy = http.createClient(80, destination);
  var preq = proxy.request(req.method, req.url, req.headers);
  console.log(req.connection.remoteAddress +" "+ req.method +" "+req.url);
  preq.on('response', function(pres) {
    res.writeHead(pres.statusCode, pres.headers);
    sys.pump(pres, res);
    pres.on('end', function() {
      preq.end();
      res.end();
      currentClients--;
      activate();
    });
  });
  req.on('data', function(chunk) {
    preq.write(chunk, 'binary');
  });
  req.on('end', function() {
    preq.end();
  });
  currentClients++;
  if (currentClients >= maxClients) {
    hs.watcher.stop();
    active = false;
  }
});
hs.listen(port);
Overall, this approach of no longer accepting new clients takes far less code than the earlier limiting method, and it lets your load balancers do smarter things under high load, hopefully sending traffic to extra capacity on another machine rather than piling more clients onto an already overloaded server. It does rely on some 'hacky' knowledge of the internals of how HTTP streams and IO Watchers work inside Node.js, and it is far less accurate in its counting. However, I believe that this method is probably one of the better ways to limit your concurrency inside Node.js.
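The bookkeeping behind this stop/start approach can be looked at separately from the watcher internals. The following is only a sketch under assumptions: createAcceptGate is a hypothetical helper, and the two callbacks stand in for whatever actually pauses and resumes accepting (hs.watcher.stop() and hs.watcher.start() in the code above). It also shows why the count is imprecise: a socket that was already accepted still gets through after the gate has closed.

```javascript
// Hypothetical helper: tracks in-flight clients and toggles accepting
// through the two callbacks it is given.
function createAcceptGate(maxClients, stopAccepting, startAccepting) {
  let current = 0;
  let accepting = true;
  return {
    // Call when a client request starts being processed.
    enter() {
      current++;
      if (accepting && current >= maxClients) {
        stopAccepting();
        accepting = false;
      }
    },
    // Call when a client request has completely finished.
    leave() {
      current--;
      if (!accepting && current < maxClients) {
        startAccepting();
        accepting = true;
      }
    },
    get current() { return current; },
    get accepting() { return accepting; },
  };
}

// Usage: record the stop/start calls instead of touching a real server.
const gateEvents = [];
const gate = createAcceptGate(
  2,
  () => gateEvents.push('stop'),
  () => gateEvents.push('start')
);
gate.enter(); // 1 in flight, still accepting
gate.enter(); // 2 in flight, gate closes
gate.enter(); // an already accepted socket slips through: 3 in flight
gate.leave(); // back to 2, still at the limit, gate stays closed
gate.leave(); // 1 in flight, gate reopens
```

The gate only reopens once the count drops below the limit, which mirrors the check in activate().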
My Beloved has painted several murals around the house, and this has somewhat raised the bar when it comes to painting the kids' rooms.
A while back I sketched up a few ideas for Z's room, and I finally got started on it. It's going to be blue on top, and green on the bottom, with various things on the various walls. The wall I've started on is airplanes, and if you squint just right, you can see two of my initial sketches here.
So far I've drawn a P-51 Mustang, an AH-64 Apache, a Piper Cub, a Sopwith Camel, a Cessna 172 that I drew from memory and so isn't particularly accurate, and a view from the front of some kind of commercial jet coming off of the runway.
Later today, I plan to start painting them. I'd like to get the Mustang and the Cessna done, so that we can put up a shelf across the top of the wall.
More pictures as it becomes more recognizable.
Meanwhile, we also repainted S's room, to match her new quilt that my Beloved made for her. It looks awesome.
Messaging-related links of interest this week:
John Cleese shares his thoughts on creativity in a short video, which I found via the excellent blog Presentation Zen. He says out loud what I think: you need silence and a place where your work isn’t interrupted at all, be it a park or a cafe. And yes, offices are usually not such a place. In the end, all the information overload will kill your efficiency. It’s time for a rethink in Germany’s offices; at the very least, a project manager should care, since it’s his budget that gets burned. In any case, read the Presentation Zen article above or watch the video below. Highly recommended.
I have some vouchers for £5 off shopping at the new Tescos in Callington. Went today to see it (and spent one of them), leaving three more which I’m unlikely to use. Free to a good home if any local reader could use one.
The shop itself is a decent-size but not huge supermarket. Strongly in its favour were freedom from muzak, and decent trolleys. Against it, prices at the till that didn’t always match those advertised on the shelves, and staff who hadn’t a clue what to do about it.
Today, my son dropped something on my keyboard. Off popped the 'p' key. The machine is out of warranty, so my only option is to knuckle down and fix it. After some period of fiddling, I'd worked half out, but was far from sure what I was doing with these surprisingly flimsy pieces of plastic.
There's a long-standing contention that open software developers aren't good at producing user documentation. (Actually, the contention is more along the lines of 'they suck at it.') I probably fall in the developer category for that purpose, although I'd like to think I'm a little better at writing the stuff than the mean. However, given my epic-in-my-own-mind battles with Texinfo and makeinfo, I wonder if there might be a bit more to it.
Namely, maybe sometimes developers don't write good user documentation because of their tools.
I know, I know -- it's a poor workman who blames his tools. But sometimes they are to blame. Think about it: what tools are used to write the good documentation? Microsoft Word, probably. Which a lot of open sourcerers won't use (or won't be caught using). What's left? Stuff like Docbook, Texinfo, raw HTML, OpenOffice Writer (which is pretty close to Word in many ways).. And, let's face it, many of those don't make writing simple -- even if the author has good user communication skills. So if it's so frustrating, what's the point? Back to writing code..
Or maybe I'm just peeved because I've been fighting with makeinfo for weeks and it's really frying my bacon.
After summarising some strategies for not losing track of tasks, meetings and conferences in the last post, this one is going to focus on looking back at achievements. If at some point you have asked yourself “Where the heck did the time go?”, maybe after two busy weeks of work, this article might have a few techniques for you.
Usually when that happens to me it’s either a sign that I’ve been on vacation (where that is totally fine) or that too many different, sometimes small but numerous tasks have sneaked into my schedule.
Together with Thilo I have found a few techniques helpful in dealing with this kind of problem. The goals in applying them (at least for me) have been:
After hearing about Scrum and its way of planning tasks I thought about using it not only for software development but for task planning in general. Scrum comprises some techniques that help achieving the goals described above:
To track what got done during the past week, we use a whiteboard as a Scrum board, putting tasks into the familiar categories of todo, checked out and done. That way, when resetting the board after one week and adding tasks for the following week, it is pretty obvious which actions ate up most of the time. The amount of work that goes onto the board is restricted to no more than what got accomplished during the past week.
So what goes onto the whiteboard? Basically anything that we cannot track as working hours: The Hadoop Get Together can be found just next to doing the laundry. Writing and sending out the long deferred e-mail is put right next to going out for dinner with potential sponsors for free software courses at university.
Now that weekly time tracking is set up, is there a way to also come up with a nice longer-term measure? It turns out there are actually three:
First and most obviously, the whiteboard itself provides an easy measure: by tracking weekly velocity and plotting it against time, it is easy to identify weeks with more or less free time. As a second source of information, a quick look into one’s calendar shows how many meetings and conferences one attended over the course of a year. Last but not least, it helps to track talks given on a separate webpage.
It helps to look back from time to time: to evaluate the benefit of the respective activities, to not lose track of the tasks accomplished, and to prioritise and maybe re-order items on the ToDo list. It would be great if you’d share some of your techniques for tracking and tracing time and tasks, either in the comments or as a separate blog post.
Our installation tour is quickly coming to an end. Today we’ll clean up by talking about three final command-line tools.
Today’s preferred version control system. Made it easy on myself: just installed the binary. Use Dropbox to keep .gitconfig synchronized:
> ln -s Dropbox/Preferences/tilde-slash/dot-gitconfig .gitconfig
My .gitconfig makes output simple and colorful. I use TextMate as my comment editor:
[user]
name = William Taysom
email = wtaysom@gmail.com
[color]
diff = auto
status = auto
branch = auto
[core]
pager = cat
editor = mate -wl1
My Ruby setup is not beautiful. I’ve been running 1.8.7 and 1.9.1. Ruby Version Manager is probably the best solution, but I’ll share my solution with you.
Ruby without other gems will never meet your jeweling needs:
> sudo gem update --system
> sudo gem update # Comes with Rails and a bunch of others.
> sudo gem install assert2 assit bundler capistrano-ext columnize fxruby gchartrb json json_pure linecache passenger rb-appscript redis rspec rspec-rails ruby-debug ruby-debug-base rubydbc echoe rubydbc yajl-ruby unroller
Some of these merit additional comment:
gchartrb provides a Ruby wrapper for Google’s chart API. Everybody needs a chart now and then.
rspec is my preferred testing library. I like its lexical scoping goodness. Ruby testing the Lispy way.
ruby-debug because hacking without a debugger is hacking in the dark.
echoe is my preferred gem builder.
unroller: why debug step-by-step when what you really want is a full execution trace?
I share my IRB configuration via Dropbox:
> ln -s ~/Dropbox/Preferences/tilde-slash/dot-irbrc .irbrc
What sort of configuration do I like? A simple prompt, just ?:
IRB.conf[:PROMPT][:WLS_PROMPT] = {
:PROMPT_I => "? ",
:PROMPT_S => '" ',
:PROMPT_C => "| ",
:RETURN => "%s\n"
}
IRB.conf[:PROMPT_MODE] = :WLS_PROMPT
IRB.conf[:AUTO_INDENT] = true
IRB.conf[:USE_READLINE] = true
IRB.conf[:LOAD_MODULES] = [] unless IRB.conf.key?(:LOAD_MODULES)
Get tab to do its most:
## Tab for command completion.
unless IRB.conf[:LOAD_MODULES].include?('irb/completion')
IRB.conf[:LOAD_MODULES] << 'irb/completion'
end
Everyone needs gems:
## I just expect this.
require 'rubygems'
I define r as a method to reload a file of interest:
## For quick loading.
$wl_main_file_name = "main.rb"
def r name = nil
  if name
    # Escape the dot so that only a literal ".rb" suffix matches.
    name += ".rb" unless name =~ /\.rb$/
  elsif $wl_test_file
    name = $wl_test_file
  elsif Dir.new(Dir.pwd).detect {|n| n == $wl_main_file_name}
    name = $wl_main_file_name
  else
    name = Dir.new(Dir.pwd).detect {|n| n =~ /\.rb$/}
  end
  $wl_test_file = name
  load $wl_test_file
end
Back when I started Ruby, Symbol#to_proc wasn’t standard:
## Symbol#to_proc
unless :symbol.respond_to? :to_proc
class Symbol
def to_proc
Proc.new {|*args| args.shift.__send__ self, *args}
end
end
end
Sometimes I want a quick summary of all the methods an object will respond to:
## ObjectSummary
# For quick inspection of an object.
module ObjectSummary
# Prints the public methods of an object grouped by module.
def _summary
header = lambda do |obj|
puts "=== #{obj} ==="
end
body = lambda do |array|
puts " "+array.sort.join(", ") unless array.empty?
end
sm = singleton_methods
unless sm.empty?
header["<< self"]
body[sm]
end
self.class.ancestors.each do |ancestor|
header[ancestor]
body[ancestor.instance_methods(false)]
end
puts "==="
end
def _s
_summary
end
end
include ObjectSummary
Sometimes I know the arguments, I know the return value, but I don’t know the method name:
## MethodFinder
# <http://www.nobugs.org/developer/ruby/method_finder.html>
class MethodFinder
# Find all methods on [anObject] which, when called with [args] return [expectedResult]
def self.find obj, res, *args
obj.methods.select do |name|
obj.method(name).arity == args.size
end.select do |name|
begin
mega_clone(obj).method(name).call(*args) == res
rescue
end
end
end
def self.mega_clone obj
begin obj.clone rescue obj end
end
# Pretty-prints the results of the previous method
def self.show obj, res, *args
find(obj, res, *args).each do |name|
print "#{obj.inspect}.#{name}"
print "(#{args.map{|o| o.inspect}.join(", ")})" unless args.empty?
puts " == #{res.inspect}"
end
end
end
# _why's addition
# <http://redhanded.hobix.com/inspect/stickItInYourIrbrcMethodfinder.html>
class MethodFinder
def initialize obj, *args
@obj = obj
@args = args
end
def ==res
MethodFinder.find @obj, res, *@args
end
end
class Object
def what? *args
MethodFinder.new self, *args
end
end
Though RSpec is now my testing framework of choice, once upon a time I found it useful to run Test::Unit tests from within IRB:
## For testing.
def t
require 'test/unit'
require 'test/unit/ui/console/testrunner'
def Kernel.beep
putc ?\a
nil
end
def Test.run *names
u = Test::Unit
# collect tests in Test
tests = []
constants.each do |const_name|
const = const_get const_name
if const.kind_of? Class
if const.subclass? u::TestSuite
tests << const
elsif const.subclass? u::TestCase
tests << const.suite
end
end
end
tests.reject! {|t| not names.include? t.name } unless names.empty?
# build suite of all tests
suite = u::TestSuite.new "all tests"
tests.each {|t| suite << t}
result = u::UI::Console::TestRunner.run suite
beep unless result.passed?
tests.map {|t| t.name}
end
def t *names
Test.run *names
end
end
It’s amazing how the little utilities you make for yourself can become lost and forgotten.
I use sudo port install ruby19 to get ruby1.9, rake1.9, etc. Might not be the best way to manage multiple Ruby versions, but it works well enough for me.
Ruby 1.9 has its own set of gems:
> sudo gem1.9 update --system
> sudo gem1.9 install thin uuid rack-contrib yajl-ruby redis usher em-http-request activerecord sqlite3-ruby mysql rspec
> sudo mv /opt/local/bin/spec /opt/local/bin/spec1.9 # To avoid name conflict.
I like the speed and convenience of local documentation. Though the following document generation commands work, they don’t seem to work well. They take a very long time and seem a bit finicky:
> svn co http://svn.ruby-lang.org/repos/ruby/branches/ruby_1_8_7
ruby_1_8_7> rdoc --op ../ruby_1_8_7_doc
> svn co http://svn.ruby-lang.org/repos/ruby/branches/ruby_1_9_1
ruby_1_9_1> rdoc1.9 --op ../ruby_1_9_1_doc # Uses about 4GB of RAM. I sense a memory leak.
For language reference, I keep a local copy of Ruby QuickRef.
Though I’ve never used Haskell for a serious project, I do like reading programming language theory papers. Haskell is the lingua franca for pure functional programming. Haskell helps you distill the essential from the irrelevant. Paul Hudak once observed:
We provided them [DARPA] with a copy of P1 [implemented in Haskell] without explaining that it was a program, and based on preconceptions from their past experience, they had studied P1 under the assumption that it was a mixture of requirements specification and top level design. They were convinced it was incomplete because it did not address issues such as data structure design and execution order.
An addendum with all the cool things I’ve found since Apollo’s initial setup.
The Camel documentation has always been limited in that it reads much like a dictionary or a reference manual, each component or data format, for example, explained in its own page but with little narrative flow showing how the parts all form together. Thankfully, that problem has now been fixed by Camel Team members Claus Ibsen and Jonathan Anstey in their new Camel In Action book published by Manning. Note I reviewed only the late-stage MEAP edition; some further changes may occur to the book when its final version is released later this year.
While each chapter emphasizes a distinct topic--from routing to datatype transformations to beans to error handling to JUnit testing to Components onwards--when read sequentially they provide the necessary flow for a Camel newbie to see the whole picture of a Camel-based system. The authors are quite diligent in providing both Java DSL and Spring DSL configuration alternatives for nearly everything they teach, and refer often to the excellent Mavenized downloadable source code by listing simple "mvn test -Dtest=xxxx" commands that can be run to quickly show a concept in use. Later chapters cover Camel development using Maven and Eclipse and the various ways to deploy and run Camel-based projects.
Although this is not an introductory-level book, a beginner will nonetheless gain the most from it, as nearly every page will offer something he didn't already know. However, intermediate developers will also have much to gain by learning additional ways of working with Camel, as the authors are pretty comprehensive in exploring the various alternative methods of routing, data transformations, transaction management, and the like. Advanced users may benefit most from specific chapters within the book--transactions, concurrency, and monitoring and management, for example. The Components chapter is fairly limited in what it can explain (and the authors admit as much): with 75+ Camel Components available, the authors can cover only some of the most important. And I personally would have liked more meat in the CXF Component section, particularly with processing SOAP client calls. But I guess that's what tutorial blog entries, from me and others, are for.
Twitter’s misuse of OAuth
: Twitter seem to be attempting to control misbehaving clients, by using the “consumer key” pair as a secret key for app developers. This is proving impossible for FOSS clients to work with, and is trivially hacked to allow third-party app impersonation. Bad idea, Twitter
(tags: twitter fail oauth standards open-source gwibber security)
Boxee Blog » How Boxee Sees the Apple TV
: go Boxee! open TV is the way to go
(tags: boxee apple tv set-top)
“if slalom”
: a great name for a common “code smell” of too much indentation, calling for merciless usage of Extract Method (via Aman)
(tags: via:akohli code-smells refactoring if-slalom programming funny if-else indentation)
/~colmmacc/ » Prime and Proper
: algorithm to perform set membership tests on enumerated sets quickly and memory-efficiently, using multiplication by primes. Nice trick
(tags: hacks colmmacc prime-numbers set-membership bloom-filters bignums algorithms programming)
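The prime trick from the last link can be sketched roughly as follows. This is not the code from the linked post, just the general idea as I understand it: give every member of the enumerated universe its own prime, represent a set as the product of its members' primes, and test membership with a single modulo. makeEncoder, encode and has are hypothetical names, the prime table is deliberately tiny, and BigInt support is assumed:

```javascript
// A small fixed table of primes; enough for a universe of eight items.
const PRIMES = [2n, 3n, 5n, 7n, 11n, 13n, 17n, 19n];

// Hypothetical helper: maps each item of a fixed universe to its own prime.
function makeEncoder(universe) {
  if (universe.length > PRIMES.length) {
    throw new Error('universe too large for this sketch');
  }
  const primeOf = new Map(universe.map((item, i) => [item, PRIMES[i]]));

  return {
    // A set is encoded as the product of the primes of its members.
    encode(items) {
      let product = 1n;
      for (const item of items) {
        const p = primeOf.get(item);
        if (p === undefined) throw new Error('unknown item: ' + item);
        product *= p;
      }
      return product;
    },
    // An item is in the set exactly when its prime divides the product.
    has(encoded, item) {
      const p = primeOf.get(item);
      return p !== undefined && encoded % p === 0n;
    },
  };
}

// Usage: GET=2, POST=3, PUT=5, DELETE=7, so {GET, PUT} encodes to 10.
const methods = makeEncoder(['GET', 'POST', 'PUT', 'DELETE']);
const allowed = methods.encode(['GET', 'PUT']);
const putAllowed = methods.has(allowed, 'PUT');
const postAllowed = methods.has(allowed, 'POST');
```

Here putAllowed is true and postAllowed is false. The products grow quickly as sets get larger, which is where the bignums come in.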
emerge -uD world. Unfortunately this will conflict with amarok when using the embedded USE flag. The reason is that mysql-5.1 doesn't really support embedded yet. You have two options:
=virtual/mysql-5.1 /etc/portage/package.mask
embedded USE flag of amarok can't be used. Remove the embedded USE flag and re-emerge amarok. Then upgrade mysql. Connect Amarok to the MySQL server and rebuild your collection.
Last weekend, our team, named “Ponies for Orphans”, participated in the Node Knockout competition. The team included three of my co-workers from Cloudkick (Russell, Tomaz, and Logan) and myself. In 48 hours, we had to build a project based on Node.js.
We were brainstorming ideas before the competition, thinking about all the cool things we could do; we even planned out some multiplayer game ideas. We quickly figured out that none of us had done anything extensive with Canvas or SVG, and the existing 3rd party libraries aren’t very comprehensive, with the possible exception of Processing.js. We also felt that we wanted something that would continue to be used after the competition. We refocused our ideas on projects that would work well with our team composition of backend programmers, and eventually settled on Nodul.es:
Nodul.es: CPAN for Node.js
Nodul.es is a web-based view of the NPM package repository for Node.js. Our goal was simple: implement what we liked about CPAN for Perl and Python’s PyPI in 48 hours of coding.
Currently you can browse by:
Let’s look at an example of a module page; Tim Smart’s node-compress module is a good one. We pull metadata from the NPM repository, fetch the latest commit from GitHub, and find all modules that depend on it.
Nodul.es is built around Node.js, using its asynchronous abilities extensively.
We split the system into 3 main components:
All of these services interact with MongoDB, which provides data storage for all of the indexed data, and ways to get it back out for webpages.
We also used several external dependencies in building Nodul.es:
We built Nodul.es in 48 hours, and until the voting is over, we aren’t allowed to change it. But we have a ton of partially completed features that we had to pull because we didn’t want to ship anything broken or incomplete. They include:
If you were wondering why I wasn’t updating this blog very often (or at all) during the last few months, I have an excuse: I was too busy writing a book!
Finally, Alfresco 3 Web Services, Build Alfresco applications using Web Services, WebScripts and CMIS has been published and I just received my complimentary author copies. You can order it, either as a PDF or as a dead-tree version, from the publisher’s website, or from Amazon.com.
This is the book you need if you are considering developing applications that need to use the services of the Alfresco Open Source ECM system, and it covers everything, from SOAP-based Web Services, to Web Scripts and REST, to CMIS. If you want to get a sample of the book’s content, you can download a sample chapter (Chapter 5: Using the Alfresco Web Services from .NET ) freely.
I have a couple copies that I could sign and give away. I will have to think of something, like a contest, who knows?
I would also like to thank my co-author and colleague, Piergiorgio Lucidi, for agreeing to join me in this crazy enterprise and working hard to make it finally see the light of day.
Today the second article of my small series in JavaMagazin appeared. While the first one described the risks in today’s software projects, this one shows several strategies for avoiding problems. There is a way to work long and hard and still stay happy and relaxed. So, check it out and let me know what you think.
This is mostly a test post to verify that this blog actually works again. I’ve made the switch from hosting it on a dedicated Linux machine to wordpress.com and in the process messed up with the domain registration, got uninterested, was distracted by work, hobbies, and vacations. As a consequence, this website was unreachable for a while, but it should be back to normal now. Let me know if you notice anything out of the ordinary.
Incidentally, I would like to recommend my ex hosting provider, Bytemark, to anyone who might need dedicated Linux hosting. I switched to WordPress.com simply because I had abandoned some projects that required a full Linux server, and decided to keep only a blog, for the time being. Bytemark has always provided me with excellent reliability and outstanding support, so I wouldn’t have switched for any other reason.
I’m flabbergasted!
Not that The Liar has published memoirs: we knew they were coming. Nor the mindboggling arrogance of those memoirs (at least as reported): again that’s as expected. But the chattering classes once again seem to give them credence. Or at least, to believe that he believes them. I mean, good grief, haven’t we learned from all those years of bitter experience?
I’d sooner take the Prince of Darkness’s word on the history of New Labour. Obviously not at face value, but Mandelson seems the more interesting and less megalomaniac(!!!) of the two. Perhaps more to the point, Mandelson has some presence in the real world as opposed to his own pure fantasyland. Or for a spot of plain speaking, reconstruct fragments from the working-class mascot John Prescott: he’s not articulate enough to lie Blair-style, and what he says (where sufficiently coherent) is at least likely to be what he means. Prescott has already rubbished what Blair says about Brown: I guess he’s too honest to let that pass when the meeja asked him.
Interesting historic question: could Brown have made a competent leader, if he hadn’t been driven (quite literally) mad by being number two to The Liar? I mean, back in the 1990s: it was clear by about the time of the second Labour term (2001) that Brown’s grasp was failing in some matters, and in retrospect he was evidently already quite mad.
One of the greatest frustrations for anyone who develops an HTML editor is the constant supply of people who are convinced they want to use BR tags instead of P tags. Most of these are just people who don’t want “double spacing” and they’re happy once you give them the magical CSS incantation:
p {
  margin-top: 0;
  margin-bottom: 0;
}
The other group, however, are people writing HTML emails who insist that P tags are completely incompatible with some email clients and cause rendering problems. At one point it seems that Yahoo! Mail really did strip them entirely, but now it just applies a different default CSS styling (ironically, the magical CSS above that removes the extra spacing). So if you naively use P without specifying the margins you want, some clients will display a blank line between paragraphs, while others, notably Yahoo!, will push all the lines immediately under each other. The solution is of course the opposite magical CSS incantation:
p {
  margin-top: 0;
  margin-bottom: 1em;
}
Solved, right? Nope. This runs straight into the “where the heck do I define styles?” problem. In HTML, it should be:
<html>
  <head>
    <style>
      p {
        margin-top: 0;
        margin-bottom: 1em;
      }
    </style>
  </head>
  <body>
    …
  </body>
</html>
However while this works in some clients, it has no effect in most. Instead, the common wisdom is to move the style tag into the body tag:
<html>
  <head>
  </head>
  <body>
    <style>
      p {
        margin-top: 0;
        margin-bottom: 1em;
      }
    </style>
    …
  </body>
</html>
Which works almost everywhere. Enter GMail. GMail never respects the style tag, only inline styles. So now you need to write your paragraphs as:
<p style="margin-top: 0; margin-bottom: 1em;">…</p>
Thankfully you can use the margin shorthand if you know what you want the left and right margins to be as well:
<p style="margin: 0 0 1em 0;">…</p>
I would strongly recommend using embedded styles while editing and then just use post processing to inline all the styles – Premailer can do that for you.
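To give a feel for what that post-processing step does, here is a deliberately naive sketch. It is not how Premailer works; inlineParagraphStyle is a hypothetical function that uses a regular expression to add one fixed style to every P tag that does not already carry an inline style. A real tool parses the HTML and applies the full stylesheet, and this sketch does not handle edge cases such as self-closing tags:

```javascript
// Hypothetical, regex-based helper: adds an inline style to each <p> tag
// that has no style attribute yet. Other tags (including <pre>) are left alone.
function inlineParagraphStyle(html, style) {
  return html.replace(/<p(\s[^>]*)?>/gi, (tag, attrs = '') => {
    if (/\sstyle\s*=/i.test(attrs)) {
      return tag; // keep paragraphs that were already styled by hand
    }
    return `<p${attrs} style="${style}">`;
  });
}

// Usage: the embedded rule for p becomes an inline style on each paragraph.
const inlined = inlineParagraphStyle(
  '<p>One</p><p class="intro">Two</p><pre>code</pre>',
  'margin: 0 0 1em 0;'
);
```

With the input above, both paragraphs end up with style="margin: 0 0 1em 0;" while the pre element is untouched.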
As far as I can tell, there is no longer any need to avoid P tags in email. Sampling a number of emails that happened to be in my inbox, I found P tags in messages sent from a few different clients, though that’s far from scientific, and they were still intermingled with a lot of <br> and <div><br></div> hacks. I would be very keen to hear from anyone who knows of an email client that cannot be made to render P tags correctly.
With a bit of luck we may be able to start moving away from the horrific abuses of <br> tags…
Packaging systems are great: like the iTunes App Store except free, accessed from the command line, and you get to feel like a big boy as packages are compiled locally. Not only that, but you get to brag when some dependency on Boost maxes out all of your cores — for an hour. Long ago Fink was package manager of choice, but MacPorts is the dish of the day.
I install it when I need it. My basic needs are easily met:
> sudo port selfupdate
> sudo port install ruby19
> sudo port install rlwrap
> sudo port install lua
> sudo port install spidermonkey
Allow me to describe the goods:
Other languages and tools: Git, Ruby, and Haskell.
A few months ago I started to play with Apache HBase, Apache’s Bigtable implementation. Since the Bigtable implementation behind Google AppEngine (the Datastore) can be accessed with JPA, I searched for a way to use HBase with JPA as well.
Fortunately there is an easy way: the DataNucleus plugin (used by Google AppEngine) supports Apache HBase as well.
Over time I noticed that people searching for “JPA HBase” are landing on my blog. As of now there is a SIMPLE demo: a (JSF2-based) web application that uses JPA to talk to an existing HBase installation.
I committed the code to GitHub.
On http://github.com/matzew/hbase-jpa-jsf I also added a little README that gives some more details on how to run the example on your machine.
Note: It is just a quick and simple example that mainly should act as a playground for those that are interested in this topic.
Enjoy!
These days my wife has a long commute. So when she is tired or has a lot of work, she sometimes sleeps at the dormitory before coming back up. I worry that the hard work might make her ill.
Spending the night alone brings on a strange feeling, along with some sleeplessness. The uncertainties of life, easy to acknowledge but hard to accept, are bound to leave one's mind unsettled. Leaning on that unsettled mind, I jot down my worries here and there. But the more I try to pour them out, the more I realize that they are no different from the worries people ordinarily have. And besides, there has been no real progress. When developing software, I have used tools such as mind maps to understand problems better and to evaluate different solutions. Have I, until now, merely been going around in circles over the far more complex matters of life, without any preparation at all? Rather than merely venting a troubled mind, I should think things through seriously and persistently, and work at solving them.
It’s back to school time and for those with academic interests in machine learning, it’s great to see Apache Mahout is catching on in academic circles, in addition to commercial circles. The interest is no doubt due to its open code, active community and focus on real world machine learning techniques. The first class based on Mahout that I am aware of was Mahout committer Isabel Drost’s class at TU Berlin. Of course, with all due respect to Isabel, she is a bit biased towards Mahout; so I was pleasantly surprised when Dr. David Grossman (See his excellent Information Retrieval: Algorithms and Heuristics (The Information Retrieval Series)(2nd Edition) book, for starters) from the Illinois Institute of Technology contacted me last spring about putting together a class on Mahout for the fall at IIT. Well, that class has finally come to fruition as CS 422: Data Mining Course Homepage and it looks to have nice coverage of the things near and dear to Mahout: clustering, classification, pattern mining and recommenders along with the requisite theoretical underpinnings.
I’ve also heard from a few other Professors who are working on adding Mahout to their coursework and would love to hear from more. So, if you are teaching a class on machine learning and interested in Mahout for teaching purposes, either let me know (gsingers@apache.org) or drop a line to the Mahout community mailing list: user@mahout.apache.org.
69.63.181.11 www.facebook.com login.facebook.com
sharkbait posted a photo:
My sister Clare's 70th birthday party in Bishop's Castle, a friendly little town in Shropshire.
It has more than its fair share of good local beer, excellent pubs and talented musicians.
This guitar has had quite a history: it was once owned by the Bee Gees.
env EDITOR=nano crontab -e
Chowder isn’t exactly rocket science, but this went pretty well, so documenting it here…
I actually made this almost entirely from frozen ingredients and it was just fine. Fresh might be better.
Finely chopped leek
Smoked bacon, sliced (I used some lardons I had in the freezer)
Cubed potatoes
Chicken stock (maybe fish stock would be better, I didn’t have any) or water
Milk (about half as much as stock)
Pepper
Mace
Cod
King prawns
Sweetcorn
Cream
Fry the leeks and bacon in a little butter/olive oil (I used both) until pretty soft – I didn’t crisp the bacon for a change. I think it is better for chowder not to. Add cubed potatoes and fry for a bit longer, then add chicken stock (or water or fish stock) and bring to the boil. Simmer until the potatoes have softened, then zap half the mixture with a blender (I just did this in situ). Season (I didn’t need salt, there was enough in the bacon). Add milk, fish, prawns and bring back up to a simmer, cook for a few minutes, making sure the fish falls apart. Add cooked sweetcorn and bring back up to temperature. Finally, add some cream.
Quantities should be chosen so that the final result is good and thick.
Serve with warm, crusty bread and butter. Works as a whole meal.
I’ve been using my iPhone 4 and iPad for several months now, so I thought I would give a report based on hard, real-world use.
iPhone 4
I love the phone. I do see the much written about antenna attenuation problem, but day to day it doesn’t affect me as much as AT&T’s network does. One of the prime times for me to use my phone is while standing in line waiting for the ferry. The worst time is during the afternoon, because there are several hundred people all packed into the ferry terminal, all trying to pull data on their iPhones. The antenna has nothing to do with this.
In every other way, the phone is fantastic. My iPhone 3G would frequently hit the red line on the battery indicator by the time I hit the afternoon ferry, and that was after I had carefully managed my use of the device during the day. With the iPhone 4, I don’t have to worry about managing the battery. That alone has made the upgrade worth it for me.
The upgraded camera has been a huge success for me. I attribute this to a single factor – startup time. I was always reluctant to pull out my iPhone 3G for use as a camera, because quite frequently I would miss the moment by the time the camera came up. I’ve been using Tap Tap’s excellent Camera+ and I like it quite a bit. Unfortunately, you can’t get it on the App Store right now, because the developer inserted an easter egg that would allow you to use one of the volume buttons to trigger the shutter. Apple then pulled the app from the store. This is the first time that App Store policy has affected an app that I care about, and I’m obviously not happy about it. It seems to me that Camera+ could have a preference that controlled this feature, and that users would have to turn it on. Since the user would have turned on that feature, they wouldn’t be confused about the takeover of the volume button. It seems simple to me. I really like Camera+’s light table feature, but I really hate the way that it starts up trying to imitate the look of a DSLR rangefinder. The other area where Camera+ could use improvement is in the processing / filters area. It has lots of options, but most of them don’t work for me. I have better luck with Chase Jarvis’ Best Camera on this front. In any case, I’m very happy with the camera as “the camera that is always with me”. The resolution is also very good, and I’ve been using it to photograph whiteboards into Evernote quite successfully.
iPad
I’ve been carrying my iPad on a daily basis. I’m using it enough that when I forgot it one day, it made a difference. One thing that I’ve learned is that the iPad really needs a case. I got much more relaxed about carrying mine once it was inside a case. Originally, I thought that I would wait for one of the third party cases, but all of the ones that looked like a fit for me were out of stock, so I broke down and ordered the Apple case. It does the job, but I am not crazy about the material, and I wish that it had one or two small pockets for a pen, a little bit of paper, and perhaps some business cards.
I am pretty much using the iPad as my “away from my desk device” when I am in the office. Our office spans 5 floors in a skyscraper, and I have meetings on several floors during the course of a day. The iPad’s form factor and long battery life make it well suited as a meeting device. I have access to my e-mail and calendar, and I’m using the iPad version of OmniFocus to keep my tasks and projects in sync with my laptop. I’ve written some py-appscript code that looks at the day’s calendar in Entourage and then kicks out a series of preformatted Evernote notes so that I can pull those notes on my iPad and have notes for the various events of the day. This kind of Mac GUI to UNIX to Mac GUI scripting is something that I’ve commented on before. Thanks to multi-device application families like Evernote, I expect to be doing some more of this hacking to extend my workflow onto the iOS devices. I don’t have a huge need for sharing files between the iPad and the laptop, but Dropbox has done a great job of filling in the gap when I’ve needed to share files.
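For anyone curious what the middle step of that calendar-to-notes script might look like, here is a minimal sketch in plain Python. It is not my actual script: it deliberately leaves out the Entourage and Evernote ends (no appscript calls are shown), and the `Event` class, the `note_for` and `notes_for_day` functions, and the note layout are all hypothetical names made up for illustration. It assumes the day's events have already been pulled out of the calendar somehow.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    """Hypothetical stand-in for one calendar entry already read from the calendar app."""
    title: str
    start: datetime
    attendees: list[str]


def note_for(event: Event) -> tuple[str, str]:
    """Return a (title, body) pair for one preformatted meeting note."""
    # Prefix the note title with the start time so notes sort chronologically.
    title = f"{event.start:%Y-%m-%d %H:%M} {event.title}"
    body = "\n".join([
        f"Meeting: {event.title}",
        f"Attendees: {', '.join(event.attendees) or 'n/a'}",
        "",
        "Notes:",
        "",
        "Action items:",
    ])
    return title, body


def notes_for_day(events: list[Event]) -> list[tuple[str, str]]:
    """Build one note per event, ordered by start time."""
    return [note_for(e) for e in sorted(events, key=lambda e: e.start)]
```

Each (title, body) pair would then be handed to whatever creates the note in the note-taking application; that part depends on the application's own scripting interface and is not sketched here.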
Several people have asked me about OmniFocus on the iPad, and whether or not it is worth it. I have a large number of both work and personal projects, so being able to use the extra screen real estate on the iPad definitely does help. I have come to rely on several features in OmniFocus for iPad which are not in the desktop version. There is a great UI for bumping the dates for actions by 1 day or 1 week, which I use a lot. I am also very fond of the forecast view, which lets you look at the actions for a given day, with a very quick glance at the number of actions for each day of a week. Both of these features are smart adaptations to the iPad touch interface, and are examples of iPad apps coming into a class of their own.
Another application that I’ve been enjoying is Flipboard. Flipboard got a bunch of hype when it launched back in July, and things have died down because they couldn’t keep up with the demand. Conceptually, Flipboard is very appealing, but the actual implementation still has some problems as far as I am concerned. I can use Flipboard to read my Facebook feed, because Facebook’s timeline is just highly variable in terms of including stuff from my friends. I don’t feel that I can read Twitter via Flipboard, because it can’t keep up with the volume, so I end up missing stuff, and I hate that. Some of the provided curated content is reasonable, but not quite up to what I’d like. Flipboard is falling down because there’s not a good way for me to get the content that I want. I want Flipboard to be my daily newspaper or magazine app. But I can’t get the right content feed(s) to put into it.
As far as iOS goes, my usage of the iPad is making me horribly impatient for iOS 4. I would use task switching all the time. Of course, then I would be unhappy because the iPad doesn’t have enough RAM to keep my working set of applications resident. Text editing on the iPad is very painful. I’m not sure what a good solution would be here, but it definitely is a problem that I am running into on a daily basis – perhaps I need to work on my typing. There is also the issue of better syncing/sharing. My phone and iPad are personal devices, so they sync to my iTunes at home. I use both devices at work, where I have a different computer. This is definitely an area that Apple needs to improve significantly. At the moment, though, the fact that I am using my iPad hard enough to really be running into the problem means that the iPad has succeeded in legitimizing the tablet category – at least for me.