Monthly Archives: February 2007

Blogging Is Meant To Be Personal

Like most people, I read a lot of web sites, and I read the majority of them for their insight and personal opinion on a given topic. Of late, there has been a flurry of web sites that automatically generate posts based on content from other web sites, such as del.icio.us.

I’m all for aggregation of content; it’s a really useful utility. However, I don’t come to someone’s site to see a list of links to other web sites. I come to someone’s web site to read about a topic, and if it happens to link out, well, that is fine and dandy.

If you want to aggregate content, put the aggregated content into your side bar or in any location other than your primary content space. You should reserve your main content space for your own content or personal opinion on something – not for an aggregation.

Generating primary content automatically based on another web site, feed or service is just impersonal; don’t do it.

Akismet, Friends Forever

[Image: Akismet Spam Filter, Caught & Nailed 80,010 Spam Messages In Five Months]

In the last five months, I have posted twice about the wonders of the free spam filtering service Akismet.

Since the last installment, another ten weeks have passed. In that time, approximately 30,000 new spam messages have been received and all of them have been blocked in one way or another. Since I installed Akismet towards the middle of last year, it has dropped a whopping 80,000 spam messages at the door, and it feels great!

You know what I’d like? I’d like it if the spammers were a little more intelligent. Clearly I’m running something on my site that is blocking their spam from ever reaching the public. If I were a spammer, I’d be keeping a close eye on which web sites my spam bots submit to and which sites the spam is actually getting through on. Essentially, if they are just brute forcing thousands of web sites, they are not being efficient spammers. At the moment, they could be spending the majority of their time spamming sites where the spam will never be published, instead of focusing their energies on the sites where it actually appears.

This can’t be a new idea but it sure seems as though the spam is relentless, even though it never makes it onto my live site.

ORA-06502: PL/SQL: numeric or value error

Today I came up against a very frustrating problem when writing some Oracle PL/SQL stored procedures and functions:

  ORA-06502: PL/SQL: numeric or value error

When I first wrote the stored function in question, I was using VARCHAR2 types for storage since it was the data type returned by an Oracle provided package. The function signature looked something like the following example:

  FUNCTION MyFunction
  (pData IN VARCHAR2)
  RETURN VARCHAR2;

The stored procedures and functions in question were being used with a lot of character information. Whilst running small sets of test data through the functions, everything was acting as expected. Unfortunately, as the test data sets increased in size I began to receive the ORA-06502 error.

As you may or may not be aware, within PL/SQL a VARCHAR2 type can store a maximum of 32767 bytes of information. When the ORA-06502 exceptions were taking place, this limit was being exceeded.

The thing that was so frustrating about the error was that it wasn’t helping me identify the problem. In this particular instance, I had refactored a significant amount of PL/SQL and during that process changed some variables from VARCHAR2 into CLOB data types.

Oracle will allow you to pass a CLOB variable into a function that accepts a VARCHAR2, so long as the length of the CLOB is less than the maximum byte limit of the VARCHAR2. Since that wasn’t throwing a type conversion error in normal circumstances, it wasn’t something that I went looking into immediately as a possible problem when the Oracle ORA-06502 errors were thrown.
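A minimal sketch of that behaviour follows, written as an anonymous block. The body of MyFunction and the variable names are placeholders of my own, not the original code; it assumes a database that performs the implicit CLOB to VARCHAR2 conversion described above, with server output enabled.

```sql
DECLARE
  vSmall  CLOB := RPAD('x', 100, 'x');  -- comfortably under the VARCHAR2 limit
  vLarge  CLOB;
  vLength PLS_INTEGER;

  -- Hypothetical stand-in for the real function: it simply echoes its input
  FUNCTION MyFunction (pData IN VARCHAR2) RETURN VARCHAR2 IS
  BEGIN
    RETURN pData;
  END;
BEGIN
  -- Build a CLOB one character longer than a VARCHAR2 can hold
  vLarge := RPAD('x', 32767, 'x');
  vLarge := vLarge || 'x';

  -- The small CLOB is converted implicitly, so this call succeeds
  vLength := LENGTH(MyFunction(vSmall));
  DBMS_OUTPUT.PUT_LINE('Small CLOB accepted, length ' || vLength);

  -- The large CLOB cannot be converted, so this call raises ORA-06502
  vLength := LENGTH(MyFunction(vLarge));
  DBMS_OUTPUT.PUT_LINE('Not reached');
EXCEPTION
  WHEN VALUE_ERROR THEN
    -- VALUE_ERROR is the predefined PL/SQL exception for ORA-06502
    DBMS_OUTPUT.PUT_LINE('Large CLOB rejected: ' || SQLERRM);
END;
/
```

The same call works or fails depending purely on the size of the data, which is why the problem stays hidden while the test data is small.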

The solution to this particular problem is what you would expect: change the data types of all associated functions and procedures to use the CLOB data type:

  FUNCTION MyFunction
  (pData IN CLOB)
  RETURN CLOB;
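As a rough illustration, the CLOB signature can be exercised with data larger than 32767 bytes. The function body here is a placeholder that just returns its input, and the variable names are assumptions for the sake of the example.

```sql
DECLARE
  vLarge  CLOB;
  vResult CLOB;

  -- Hypothetical body: the real function would do its processing here
  FUNCTION MyFunction (pData IN CLOB) RETURN CLOB IS
  BEGIN
    RETURN pData;
  END;
BEGIN
  -- 32767 characters plus another 1000 gives 33767 in total
  vLarge := RPAD('x', 32767, 'x');
  vLarge := vLarge || RPAD('y', 1000, 'y');

  -- No conversion to VARCHAR2 takes place, so no ORA-06502 is raised
  vResult := MyFunction(vLarge);
  DBMS_OUTPUT.PUT_LINE('Length: ' || DBMS_LOB.GETLENGTH(vResult));
END;
/
```

Every function and procedure in the call chain needs the same change; a single remaining VARCHAR2 parameter or local variable would reintroduce the error for large inputs.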

After looking into the error in more depth, I found that it can be thrown for virtually any generic constraint violation. The following simple examples would produce this error code:

  • assign a NULL value to a variable defined as NOT NULL
  • assign a value greater than 99 to a variable defined as NUMBER(2)
  • assign a CLOB with a length greater than 32767 bytes to a VARCHAR2 variable
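The three cases above can be sketched in a single anonymous block. The variable names are made up for illustration, and each assignment sits in its own sub-block so that one failure does not stop the others from running.

```sql
DECLARE
  vNothing  VARCHAR2(10);                    -- never assigned, so it is NULL
  vRequired VARCHAR2(10) NOT NULL := 'set';
  vTwoDigit NUMBER(2);
  vBig      CLOB;
  vText     VARCHAR2(32767);
BEGIN
  -- Case 1: a NULL value assigned to a NOT NULL variable
  BEGIN
    vRequired := vNothing;
  EXCEPTION
    WHEN VALUE_ERROR THEN
      DBMS_OUTPUT.PUT_LINE('NOT NULL:  ' || SQLERRM);
  END;

  -- Case 2: a value greater than 99 assigned to a NUMBER(2) variable
  BEGIN
    vTwoDigit := 100;
  EXCEPTION
    WHEN VALUE_ERROR THEN
      DBMS_OUTPUT.PUT_LINE('NUMBER(2): ' || SQLERRM);
  END;

  -- Case 3: a CLOB longer than 32767 bytes assigned to a VARCHAR2 variable
  BEGIN
    vBig  := RPAD('x', 32767, 'x');
    vBig  := vBig || 'x';
    vText := vBig;
  EXCEPTION
    WHEN VALUE_ERROR THEN
      DBMS_OUTPUT.PUT_LINE('CLOB:      ' || SQLERRM);
  END;
END;
/
```

All three are caught by the same predefined VALUE_ERROR handler, which is what makes the error so generic: the handler alone does not say which kind of constraint was violated.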

I think it would have been a little more useful to a developer for Oracle to throw some sort of a type conversion error in this instance.

The ConTest, Anticlimactic Crap

Channel Ten launched their latest game show offering, named The ConTest, last week. Like all new game shows, The ConTest was hyped as the greatest thing since sliced bread; unfortunately for Channel Ten, their latest TV show just didn’t cut the mustard.

With a name like The ConTest, you’d be inclined to think that the game show has something to do with conning your opponent. Channel Ten advertised it as the game show where having street smarts was far more important than being smart and where you could win the top prize without answering a single question correctly. The ConTest delivered on the last bit: you could win it without answering a single question right. However, it really didn’t deliver anywhere near enough oomph on the con part.

At the start of the show, all of the contestants get the opportunity to tell their opponents about themselves whilst the opponents get a chance to ask some questions of them. The idea is that the opponents get a chance to suss out their fellow competitors and get a feel for each other’s personalities and, more importantly, whether they are bluffing or not.

During the show, Andrew G asks a series of questions to the contestants. Unlike a traditional game show though, the contestants do not know how each other are going; they are flying blind. At the end of a round, without knowing how many questions their opponents got right – they have to decide if they are going to continue on or fold. If they fold, they get to leave the game with the money they have accrued so far. If no one folds for the round though, the person with the lowest score is automatically eliminated from the game.

After the first round, I was immediately thinking that it wasn’t all that exciting or filled with suspense. When the contestants went up to the next podium to decide if they were going to fold or not, I was expecting the other contestants to challenge each other about their scores. Unfortunately, that really didn’t happen and just like that you’re thrown back into another series of rounds, just like the first – boring.

When I first saw the game show advertised, I immediately expected it was going to be a game show version of the card game Bullshit. In the card game, you have to throw down as many cards as you can, as fast as you can, whilst bluffing your opponent into thinking you actually had the cards in your hand that you’ve just thrown down. If one of your opponents thinks you’re bluffing, they can call bullshit, at which point they check what cards you threw down. If you get caught out, you have to take the cards back; if your opponent gets it wrong, then they need to pick up your cards. Bullshit is a fun and fast paced card game because you get the opportunity to read your opponent and challenge them when you think they are lying through their teeth!

Unfortunately, The ConTest cannot live up to the lofty expectations set by the classic card game; instead it’s a bunch of people answering questions and it doesn’t matter if they are right or wrong. At the end of each round, someone goes home without a lot of fuss and after a few rounds someone wins some cash.

Sorry Channel Ten, it’s as boring as bat shit to watch.

Google Image Labeler Included In Google Webmasters Tools

In September 2006, Google released a new utility in the form of a game named Google Image Labeler.

The game aspect of the Google Image Labeler involves a pair of people. The contestants are chosen at random to play against one another based on who is online at any point in time. Each game lasts for 90 seconds and the contestants are shown the same series of images which they have to tag or describe with words or phrases. The contestants gain points when they match words or phrases with their opponent.

By awarding points when you match words with your opponent, Google are assuming both contestants consider the image to depict the same object. At some point, Google will end up using this information in Google Images to provide a better quality of service to their customers.

The service aspect of the Google Image Labeler is of course about providing a higher quality of service to the Google user base. At the moment, Google rely on webmasters providing context around any images that they use on their web sites. As a simple example, a webmaster might:

  • provide a meaningful name for the image
  • provide a useful alt attribute, which describes the image in text format
  • provide captions for the image, which might be a more in depth text description of the image
  • talk about the image in the main content on the web page

Whilst this mechanism is very useful and in most cases accurate, it can also be inaccurate or abused. By relying on random Google users to categorise the images, the chances of an image being misrepresented are vastly reduced.

Having humans categorise the images also lends itself to Google producing software that learns how to recognise images. Google could attempt to identify what the images are on their own and use the tags or labels provided by the Google user base to essentially compare or validate the results.

When logged into the Google Webmasters console, you are now able to select whether or not you want the images on your site to be visible to the Google Image Labeler service. At this stage, I’m not quite sure why you would opt out of it – however Google are giving webmasters the option should they choose to.

If Google do end up walking the machine learning path, it could be interesting times ahead for the image searching service.