Archive for July, 2008

Over a trillion unique URLs in Google (but you knew that already)


2008
07.26

On the Google Blog, they’ve been going on about how quickly the Google index has grown over the company’s lifetime:

The first Google index in 1998 already had 26 million pages, and by 2000 the Google index reached the one billion mark. Over the last eight years, we’ve seen a lot of big numbers about how much content is really out there. Recently, even our search engineers stopped in awe about just how big the web is these days — when our systems that process links on the web to find new content hit a milestone: 1 trillion (as in 1,000,000,000,000) unique URLs on the web at once!

Now, they go on to qualify the whole thing about unique URLs later, talking about a potentially infinite quantity of links on the web, like ‘next month’ links and the like. Sure. But what about the specifically Google-tailored and subdomain cross-linked ‘SEO’ domains? I’m sure we can allow a margin of error of 30% – 35% for auto-generated pages, links and domains.

But this example shows that the size of the web really depends on your definition of what’s a useful page, and there is no exact answer.

So they don’t index every page. Real content vs Google-specific content…

I think it may be safe to simply agree that the index is big. Very big. With lots and lots of pages. There’s not much point in going on about the number.

ex Lightedge, Ends On… (but you knew that already)


2008
07.23

Just as a quick follow-up to the previous post about ex-Lightedge vs Ends On earlier this week, guess who I received an unsolicited email from today! Yup, you guessed it – seems like old mailing lists are being reused to drum up support for the new business… either bought – or inherited? And if neither, that’s spam… but you knew that already.

Makes you wonder – what do you pack if you need to leave home (or the office…) in a rush? Seems like the lesson is to take along your base to sell from in future. Mere unfounded speculation, mind you…

Aaaaaaaaaaaanyway…

Ubuntu 8.04 Dual-screen


2008
07.22

From the bug-report there’s a simple solution to dual-screening on Hardy…

Users upgrading from Dapper Drake to Hardy Heron who have used Xinerama to support dual monitors now have to use a different method (xrandr I think). This isn’t mentioned in the release notes and the graphical tool for configuring dual monitors is under “Screen Resolution” in the System -> Preferences Menu. The point being that it is quite hard to find.

Users with multiple desktops appear to have to add the following to the xorg.conf file “Screen” section:

SubSection "Display"
    Depth 24
    Virtual 4000 2048
EndSubSection

Then the screen resolution tool works fine for me.

Me too! Just to complete this (it’s all out there…):

Unfortunately, for creating dual-screen layouts there is still one manual configuration step required, which is to add a Virtual framebuffer size. The size needs to be equal or greater than the maximum combined size of your displays. For example, if you have two 1920×1200 monitors you wish to put side-by-side, you would add a Virtual line like this:

Section "Screen"
        Identifier      "Default Screen"
        Device          "Configured Video Device"
        DefaultDepth    24
        SubSection "Display"
            Depth           24
            Virtual         3840 1200
        EndSubSection
EndSection

Note that setting Virtual to larger than 2048×2048 disables 3d acceleration (i.e., no Compiz).
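To save doing the sum by hand, here is a minimal sketch that works out the Virtual line for a given set of monitors. The function names and the `layout` parameter are made up for illustration, and the 3D check assumes that the 2048 limit quoted above applies to each dimension separately.

```python
def virtual_size(monitors, layout="side-by-side"):
    """Return the minimum (width, height) for the xorg.conf Virtual line.

    monitors: list of (width, height) tuples in pixels.
    layout:   "side-by-side" adds up widths, "stacked" adds up heights.
    """
    widths = [w for w, _ in monitors]
    heights = [h for _, h in monitors]
    if layout == "side-by-side":
        return sum(widths), max(heights)
    if layout == "stacked":
        return max(widths), sum(heights)
    raise ValueError(f"unknown layout: {layout}")


def disables_3d(size, limit=2048):
    """True if either dimension exceeds the limit.

    Assumption: the 2048x2048 note applies per dimension.
    """
    width, height = size
    return width > limit or height > limit


size = virtual_size([(1920, 1200), (1920, 1200)])
print("Virtual %d %d" % size)                      # Virtual 3840 1200
print("3D acceleration lost:", disables_3d(size))  # True
```

For the two 1920×1200 monitors from the example this gives the same 3840 1200 as the quoted config, and it also flags that Compiz would be lost.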

ex-Lightedge vs Ends-On (but you knew that already)…


2008
07.20

Interestingly, the top search result that pops up on the “Do No Evil” Search is ends-on.co.za (not the stories relating to the Valentine’s Day demise of Lightedge Technology). Go figure. The old lightedge.co.za domain is pointing to the same server as ends-on (at SA Domain) and doing a re-direct.

Hm. lightedge.co.za – paid on 2 June 2008 (still registered to Lightedge / Bobby Richter as Tech Contact at the time through UUNet / Verizon) and changed only on 24 June 2008 to repoint to the ends-on setup. Ends-on, for its part, was only registered on 2 March (to gapsoft@ our favourite telecoms provider’s network) and then updated two months later. Thus nabbing the Google position.

Ends-On has lynnm (Lynn Munch’s ?) and grantp (Grant Poulton’s ?) details as primary contacts, and is privately registered in Grant’s name at the domain registry.

Even though there’s not much on the site (and the mailers have been going out), expect to see some familiar product offerings on the site once the product listing gets updated… Then we can see pricelists etc., too.

The 2006-registered company (a shell, I guess, then renamed?) has a Micro-Enterprise Exemption giving it BBBEE Level 4 Contributor status and a 100% Recognition score until May 2009 (which was issued as a Microsoft Word file!?)

Nothing wrong with that, free trade… But you knew that already…

Origin of the name Welzel


2008
07.15

Welzel (Görlitz [17] Liegnitz [9] Grünberg [5] Glatz [9] Reinerz [10] Neurode [15] Neisse Beuthen [13]).

German form of Welczek, that is, Velislav (county!). Welczel Beler 1364 = Walczil Beler 1370 Breslau (Reichert, H., Die deut. Familiennamen nach Breslauer Quellen. Breslau 1908, p. 16); Welczel Beck 1354 Glatz; Welczel Kauffler 14th/15th century (Burdach, Vom Mittelalter zur Reformation. Vol. 9); Welczil as a family name 1369 Liegnitz; Peter Welczel 1432 Glatz; Welczlinus (de) Kaczbach Liegnitz 1340 ff. = Petir Welczil von der Kaczbach, priest 1370 Liegnitz; Veczentz Weltzel 1403 Görlitz. Note especially Velcellinus Romung 1333 Fraustadt (! Codex dipl. Silesiae, Breslau 1857 ff., vol. 22) and Welcelinus filius Jenzingi (!) 1318 Jauer. Without suffix: Welcz (B) and Felz!, Feltz!

Found here ;) Well, what do you know! :)

Worldwide DNS patch roll-out (and the slow, lagging one)…


2008
07.13

So the DNS patch rolled out on 8 July. But you knew that already. Vendor after vendor started popping out of the woodwork, and Ubuntu mirrors slowly started updating worldwide to roll out updates to bind9, the glibc stub resolver etc… The original CERT announcement had many ‘no response’ entries from a wide variety of providers (having been alerted only a few days earlier)… When asked how the issue was addressed in his application, dnsmasq’s author Simon Kelly, for example, had this reaction:

Good question.
I wasn't contacted in advance about this, and no patch for dnsmasq has
been released. Since the exact nature of the new vulnerability has not
(as far as I know) been announced, I don't know if dnsmasq is vulnerable.
My current plan is to implement query-port randomization, and I'm
working on that right now. If all goes well, it will go into 2.43, and
be released ASAP. To help with this, I'd like to gather as many testers
as possible. The changes are quite intrusive, and to be confident about
releasing them quickly, I'd like to get as many people as I can testing.
Since query-port randomisation is potentially quite resource-heavy (it
needs a socket per query), and will break many firewall configs, the
current plan is to make it optional, and not the default behaviour.
Cheers,
Simon.

Microsoft also came to the party with MS08-037, their Windows patch for the problem. However, other vendors are going nuts now, having to issue repatched versions (think ZoneAlarm – ok, that is, if you’re using it) of their own proprietary software. Depending on a predictable port? Badly implemented patch(es) by the upstream provider?

The fix (not the attack vector) is described as follows, e.g. in DSA-1603-1:

“Dan Kaminsky discovered that properties inherent to the DNS protocol
lead to practical DNS cache poisoning attacks. Among other things,
successful attacks can lead to misdirected web traffic and email
rerouting.

This update changes Debian’s BIND 9 packages to implement the
recommended countermeasure: UDP query source port randomization. This
change increases the size of the space from which an attacker has to
guess values in a backwards-compatible fashion and makes successful
attacks significantly more difficult.” – Ubuntueque
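To put a rough number on “increases the size of the space”, here is a back-of-the-envelope sketch. The 16-bit transaction ID is part of the DNS protocol. The port range is an assumption for illustration only (all unprivileged ports, 1024 to 65535), since real resolvers may draw from a smaller pool.

```python
TXID_BITS = 16  # DNS transaction IDs are 16 bits wide


def guess_space(source_ports):
    """Number of (transaction ID, source port) pairs an attacker must cover."""
    return (2 ** TXID_BITS) * source_ports


# Before the patch: one predictable source port, only the ID to guess.
fixed_port = guess_space(1)

# After the patch. Assumption: the port is picked from 1024-65535.
random_ports = guess_space(65535 - 1024 + 1)

print(fixed_port)                  # 65536
print(random_ports)                # 4227858432
print(random_ports // fixed_port)  # 64512
```

So under that assumption a blind spoofer goes from roughly 65 thousand possibilities to roughly 4 billion, which is harder, not impossible.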

Dan Kaminsky went on at length about the need for, and the success of, CERT in the roll-out of the patches across vendors. In South Africa, DNS slowed to a crawl as SAIX, Verizon, IS etc patched their servers (OK, Verizon also switched its ATM link to SAIX to GigEthernet, so after being veeeery slow, things started to fly…) SAIX’s DNS servers took 2 days after release to go live with randomised, unpublished source ports.

Bind9 and dnsmasq are both “affected” by this…

The problem with all this is the lack of information and the uncertainty. OK, argue that this was in the interest of security. Don’t make the weakness known, to protect the web. But *no-one* knew what the technical details were (or could verify the patch and peer-review the process) until he finally disclosed them privately to Ptacek and Dai Zovi… who agreed that all was above board.

I guess that leaves three concurrent steps to take:

  1. Wait for BlackHat 2008
  2. Patch whatever servers are out there in anticipation
  3. Trust.

Kaminsky put it nicely in his blog:

So here’s the bottom line.  I think people don’t have enough information right now, to determine whether there indeed exists any context in which a huge press rush should occur with so few deep technical details.  When everything is on the table, I leave it to the community to judge whether we have gained or lost credibility through this effort.

But it’s clear that, in lieu of details, to not even have respected and completely independent members of the community vouching for your work cannot stand, no matter how respected you are in the community, no matter how many vendors are behind you, no matter what.  OK.  So that’s a fairly big lesson learned, in a process I’ve sort of been making up as I’ve gone along.  Thanks to Dino and Thomas for setting me straight.

Let’s see where this takes us…

Single-line password-less SSH remote login (but you already knew that)


2008
07.11
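ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub | ssh youruser@example.com 'cat - >> ~/.ssh/authorized_keys'

Choose an empty passphrase when prompted. And you’re done… (DSA keys and the authorized_keys2 file are deprecated in current OpenSSH, hence rsa and authorized_keys here.)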

Routing – but you knew that already


2008
07.09

From the RTFM

In this example, your client machine is connected to a firewalled LAN through ethernet device eth0. Its IP address is 12.34.56.78; its network is 12.34.56.0/24; its router is 12.34.56.1.

Your network administrator may have told you to use 12.34.56.1 as default router, but you shouldn’t. You should only use it as a route to the client side of the firewall.

Let’s suppose the client side of your firewall is made of networks 12.34.0.0/16 and 12.13.0.0/16, and of host 11.22.33.44. To make them accessible through your client router, add these routes to your global network startup script:

route add -net 12.34.0.0 netmask 255.255.0.0 gw 12.34.56.1
route add -net 12.13.0.0 netmask 255.255.0.0 gw 12.34.56.1
route add -host 11.22.33.44 gw 12.34.56.1

You must also keep the route to the client’s local network, necessary for Linux kernel 2.0 and earlier, but unnecessary for Linux kernel 2.2 and later (that implicitly adds it during the ifconfig):

route add -net 12.34.56.0 netmask 255.255.255.0 dev eth0

On the other hand, you must remove any default route from your scripts. Delete or comment away a line like:

route add default gw 12.34.56.1

Note that it is also possible to remove the route from the running kernel configuration without rebooting, by the following command:

route del default gw 12.34.56.1
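To see what that routing table actually does to a given destination, here is a small sketch using Python’s standard `ipaddress` module. It is only a model of the lookup (most specific matching route wins), not what the kernel runs. The `ROUTES` table and the `lookup` helper are made up for illustration, and 192.0.2.1 is just a documentation address standing in for “somewhere on the internet”.

```python
import ipaddress

# Routes from the example above: (destination network, next hop).
ROUTES = [
    (ipaddress.ip_network("12.34.0.0/16"), "gw 12.34.56.1"),
    (ipaddress.ip_network("12.13.0.0/16"), "gw 12.34.56.1"),
    (ipaddress.ip_network("11.22.33.44/32"), "gw 12.34.56.1"),
    (ipaddress.ip_network("12.34.56.0/24"), "dev eth0"),
]


def lookup(destination, routes=ROUTES):
    """Return the next hop of the most specific matching route, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if addr in net]
    if not matches:
        return None  # no default route, so nothing matches
    # Longest prefix wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


for dest in ("12.34.56.99", "12.34.1.1", "11.22.33.44", "192.0.2.1"):
    print(dest, "->", lookup(dest))
```

The last address comes back as `None`: with the default route gone, anything outside the listed networks has no route at all through eth0, which is the point of the exercise.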

Just so that it’s all in one place :)

Search and replace string in multiple files (and skip if necessary)


2008
07.08

There are two easy ways here (well, more, but I’m only going for two):

  • Perl
    • quick and simple and easy syntax
    • easy as “pie” (get it?)
    • perl -p -i -e 's/PASSWORD/NEWPASSWORD/g' *.php
    • wrap it in a script for safekeeping for the future :)
  • sed
    • it was made for strings
    • got to check back for the syntax, but oh so powerful
    • find . -type f ! -name '*.phtml' -exec sed -i 's/PASSWORD/NEWPASSWORD/g' {} \;
    • skip phtml files above, running on files from local directory downwards, replacing PASSWORD with NEWPASSWORD.
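And for when neither perl nor sed is to hand, a minimal sketch of the same idea in Python. The `replace_in_tree` name and its parameters are made up for illustration. It does a plain string replace rather than a regex, and skipping files that are not valid UTF-8 is an assumption of this sketch.

```python
from pathlib import Path


def replace_in_tree(root, old, new, skip_suffix=".phtml"):
    """Replace `old` with `new` in every file under `root`.

    Files whose name ends in `skip_suffix` are left untouched.
    Returns the list of paths that were actually modified.
    """
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.name.endswith(skip_suffix):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # assumption: skip binary / non-UTF-8 files
        if old in text:
            path.write_text(text.replace(old, new), encoding="utf-8")
            changed.append(path)
    return changed
```

Calling `replace_in_tree(".", "PASSWORD", "NEWPASSWORD")` mirrors the sed line. As with sed, running it twice turns NEWPASSWORD into NEWNEWPASSWORD, because the new string contains the old one.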

Don’t make your password PASSWORD. Or NEWPASSWORD, for that matter.
Ideally, don’t store your password in code, anyway. Use a bootstrap init file – or just store it in your config file; have the Framework pick up the appropriate details for the DB server you’re working on (but that’s another post…)

Stripping out illegal characters


2008
07.08

I keep forgetting it…

$safe_foldername = preg_replace('/[^a-zA-Z0-9]/','',$unsafe_foldername);

This, of course, supersedes the previous file name stripper – should be called instead of the replace…
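For the record, the same whitelist trick sketched in Python, mostly to show what actually survives. The `safe_foldername` function is a made-up name mirroring the PHP variable above.

```python
import re


def safe_foldername(unsafe):
    """Drop every character that is not an ASCII letter or digit."""
    return re.sub(r"[^a-zA-Z0-9]", "", unsafe)


print(safe_foldername("../etc/My Folder_2008"))  # etcMyFolder2008
```

Dots, slashes, spaces and underscores all go, so path tricks like `../` cannot get through. Note that the result can be an empty string if nothing in the input was a letter or digit, so check for that before creating the folder.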