Home Lay Value Picks - 4th-7th Dec 2015


Lay the home team if 1.90 or below just before KO.

Lay Of The Day - Nov 2015

Lay Of The Day
What a great month and what a disaster at the end!

November continued a winning trend that started at the tail end of October yet finished with a losing week that resulted in an overall loss of some 2pts. Yet it looked like being so different.

A new record of 18 winning lays in a row was achieved. So that's a positive. So much so that I was beginning to entertain the idea of going through the whole month without hitting a winning lay. Unfortunately that was not to be, big time.

Having created a positive record during the month, a negative one was achieved as well. The last week has seen 4 failed selections from 6 placed resulting in a record draw down of 14.8pts. That's going to take a while to recover from.

Indeed, as I move into December, I can announce that this public trial will be brought to a close at the end of the year. That will be a few days over six months and more than enough time to assess its success. But I'm getting ahead of myself... There's four weeks of selections to go as I look to recover some of the losses that this past week has inflicted. I take comfort in the fact that the overall trend is still strongly pointing upwards.

As ever, you can follow the announcements on Twitter and you can review a spreadsheet of all selections and results by clicking on the image.

Home Lay Value Picks - 27th-30th Nov 2015

Another quiet weekend for the home lays...


Only lay the home team if they are 1.90 or lower just before KO.

Home Lay Value Picks - 20th-23th 2015

A full domestic card to choose from now the International week is done and dusted...


Only lay if home team is at 1.90 or less prior to K.O.

Home Lay Value Picks - 13th-16th Nov 2015

Very quiet weekend...


Only lay the home team if odds are 1.90 or less at K.O.

Home Lay Value Picks - 6th-9th Nov 2015

I'm afraid this is continuing to perform rather poorly this season for reasons I've yet to establish. I'll bash on with the selections though and hope for a turnaround...


Lay the home favourite if the odds are 1.90 or less at KO.

Lay Of The Day - Oct 2015

Lay Of The Day - Oct 2015
What a strange month, not to mention a disappointing one despite a small profit of a little over 4pts.

It got off to a poor start with the first lay losing 5pts but quickly recovered with 9 successful lays in a row excluding a DNQ.

DNQs were a bit of a bane in the latter half of the month. There were eight of the blighters together with three no shows as I was away. Very frustrating and unfortunately I let that frustration get the better of me. Having selected yet another horse that was well into DNQ territory I decided to discount it and go for my second selection instead.... to my cost. As a result, the return for the month is 6pts less than it should have been.

When I'm looking at possible lays, I don't look at the odds. In other words, the price is never a consideration in the selection process - but with the proviso that prices of 10.0 and above result in non-qualification. I deviated from that process and you saw the result. It serves as a reminder that no matter how experienced you are, discipline is key to long term profits.

Lesson learned (again, because it is easy to forget it). Always stick to the plan if you have one. The only time you should ever change it is if the plan is clearly a long term failure at which point you need to think of something else.

Despite that hiccup, the Lay Of the Day remains on a steady upward trend at +35 points after 4 months. Long may that continue.

As ever, you can follow the announcements on Twitter and you can review a spreadsheet of all selections and results by clicking on the image

Home Lay Value Picks - 31st Oct-2nd Nov 2015

It has been quite quiet on the home lay value front in recent weeks. This lot should make up for that...


Lay the home team if they are 1.90 or less just before kick off.

Home Lay Value Picks - 20th-22nd Oct 2015

Only the one selection from the midweek games this time around...


As usual, look to lay the home team if 1.90 or less just before KO.

Home Lay Value Picks - 16th-19th Oct 2015

A surprisingly small selection for a weekend....


Lay the home team if 1.90 or less just before KO.

Home Lay Value Picks - 2nd-5th Oct 2015

 Another busy weekend for the value lays...


Lay the home side if they are 1.90 or less at KO.

Lay Of The Day - Sept 2015

Lay Of The Day - Sept 2015
Another month gone and the Lay Of The Day selections continue to perform well.

With over 31pts acrued since the start of the trial, the lays are averaging a tad over 10pts per month. If that sort of return continues over the coming months I will be delighted.

Sept saw a profit of 10.84pts after 5% commission was taken into account. In all there were 18 lays with only 3 of those winning their respective races. This is a particularly pleasing figure as there were a few days were, due to other commitments, I was unable to submit a selection.

In addition, there were 7 selections that did not qualify as their BSP was into double figures. Having said that, one of those selections won at BSP of 11.29 so something of a lucky escape there - but that's why the criteria is there in the first place. This tends to happen primarily at the bigger meetings of which there were a couple during the month.

As we move firmly into the realms of the national hunt season, it will be interesting to see how the lays progress. More of the same will be most welcome.

As ever, you can follow the announcements on Twitter and you can review a spreadsheet of all selections and results by clicking on the image.

Home Lay Value Picks - 25th-28th Sept 2015

A fairly quiet weekend by recent standards...


Home teams to be laid if 1.90 or less just before KO.

Home Lay Value Picks - 22nd-24th Sept 2015

Only three midweek selections this time around...


Only consider laying these if they are 1.90 or under just before KO.

Home Lay Value Picks - 18th-21st Sept 2015

Another decent sized batch of value home lay for your deliberation...


Highlighted selections to be laid if pre-KO price is 1.90 or below.

Time For A Break

One of the benefits of trading as a hobby is the lack of pressure to earn a living from it. As such I have no worries of missing opportunities in order to pay the bills. I can walk away from it for periods without any issues.

My annual subscription to Fairbot just expired which presents a very convenient opportunity to take a break. Why? Well, one of my New Year resolutions for this year was to finally get around to building myself a new hifi amp - something I've been wanting to do for years but never seemed to find the time. Well, I've now collected virtually all the parts I need so it's time to start building.

I'll be constructing something called a Gainclone. Most of the work will revolve around creating a decent looking enclosure - the cost of buying a quality, attractive case can double the cost of the project. So the plan is to re-use and re-purpose an existing box with modifications as required.

I've no idea how long it will take me as I haven't done any electronics since University in the early 80s. So, assuming I haven't blown myself up, burned the house down or electrocuted myself I'll be back trading once it is done.

In the meantime, I still plan to post the Value Home Football Lays on the blog as well as the Lay Of The Day horse selections on Twitter.

Home Lay Value Picks - 11th-14th Sept 2015

Back on the trail of some value home lays after the international break...


The usual criteria - only lay these if the odds are 1.90 or less before the KO.

Home Lay Value Picks - 4th-7th Sept 2015

After  a bit of a shocker last weekend it is perhaps fortunate that the International weekend has severely limited the matches to choose from.

Only two selections with the usual criteria of not laying above 1.90.


Home Lay Value Picks - 28th-31st Aug 2015

Another unusually long list of selections to choose from. Only lay those home teams that are 1.90 or under just before kick off.


Lay Of The Day Hits 20pts Profit

Lay Of The Day
It's been a touch over two months and with today's lay of the day selection bringing home the goods, I'm delighted to say the trial has hit the 20pts profit mark.

Though not quite over, and at the risk of tempting fate, August has proved to be an excellent month. This is in stark contrast with July which suffered from a ten day spell where it was difficult to establish any momentum. On reflection, that was largely my fault as I tried to be clever and fine tune things after a good start to the trial. When I went back to keeping things simple, progress picked up again. Lesson learned.

I originally intended this to be a relatively short trial. However, encouraged by the performance this past month, I think it is sensible to continue with the exercise. This is especially true as we transition from flat racing to jumps racing over the coming month. The more data points I have, the greater the quality of that data.

So, look out for the selections continuing over on Twitter and you'll see a full list of selections and performance if you click on the image.

Home Lay Value Picks - 21st-24th Aug 2015

I don't recall ever having so many selections over a weekend. Hopefully this lot can do something about the poor start this has had this year...


As usual. 1.90 is the maximum odds these should be laid at.

Home Lay Value Picks - 18th-20th Aug 2015

I'm afraid to say this year has got off to a shocking start. After a bit over two weeks, the selections are already 11pts down. Very disappointing. Still, there's another 50 weeks to go so no panic yet.

There's a few midweek selections this time around. The usual criteria applies; only lay these if they are 1.90 or less.


Home Lay Value Picks - 14th-17th Aug 2015

Plenty of selections again this weekend...


Lay the home team if you can get 1.90 or less.

Home Lay Value Picks - 11th-13th Aug 2015

A whole host of midweek matches taking place this week and I've only managed to find three; all of them in the Capital One Cup...


Lay the home favourite only if you can get 1.90 or less.

Home Lay Value Picks - 7th-9th Aug 2015

The domestic season gets fully under way this weekend and that is reflected in the number of selections available...


Lay the home team if you can get 1.90 or less.

Home Lay Value Picks - 1st-2nd Aug 2015

The start of a new year and a new season and the Back Or Lay Picks have morphed into Home Lay Value Picks.

As alluded to in an earlier post, the Back element of the previous incarnation has proved somewhat inconsistent over the years so I've decided to drop it and concentrate on laying home teams that I think are too low in the odds. The same rules apply for this:

1. Only lay if you can get odds of 1.90 or lower
2. For the purposes of recording results, I'll be using the odds recorded at BetExplorer.
3. Recorded profit and losses are based on a 1pt liability with 5% commission deducted from profitable lays.
4. Picks will be given for weekend matches and again for any midweek games.
5. Selections will be posted on the blog though I'll announce their presence in Twitter.
6. Follow at your own risk

At some stage I might get around to posting a results spreadsheet online but for now, let's bash on with the selections:


The selection of Celtic is perhaps one to raise eyebrows, but the numbers are indicating it's a value lay so I've got to go with what the sums are telling me.

Back Or Lay 2014-2015 Final Summary

As there are no more suitable Back Or Lay selections for this month, indeed, this calendar year, I'll take the opportunity to summarise the performance for the year 1st Aug 2014 - 31 July 2015:

Bet TypeNo. Of BetsPts P&L
No Bets19NA

An overall ROI of 6.68% is an excellent result though I'm disappointed with the performance of the back bet selections which have been consistently poor all year. Ignoring those we see that the lays on their own produced an ROI of 12.82%.

Having reviewed my results over the past few years it is clear that the back selections are just too inconsistent. As a result, I'm going to drop them for the coming year and concentrate on laying home teams that I think are on the short side.

With the domestic leagues starting soon we'll soon see a sharp rise in the number of selections available. Stay tuned or follow on twitter as the calendar year starts on 1st August 2015.

When Is A Trend Not A Trend?

If you do not keep records of your trading and trials, spotting patterns and trends can be difficult, if not impossible. I've always tried to do so and the past three days have reminded me of its importance.

I'm investigating a new strategy that makes use of the new automation facility in Fairbot. I don't intend discussing the strategy itself but it has immediately flagged a phenomenon (I'm reluctant to call it a trend after three days) which I have often seen in the past. Indeed, I'm sure I've posted about it before.

Since starting my life on Betfair, I have frequently ruined a good day by burning much, if not all of my profit in the last hour of racing. Whether I was scalping or swing trading; dobbing or laying the field it was remarkable how often I'd be doing quite well in the first 2-2.5 hours of trading only for one or two races to come along late in the card where I'd throw it all away.

I never really found an explanation for it other than my own stupidity. I put it down to tiredness on my part. Having spent all afternoon concentrating on the markets, I just assumed my concentration had slipped and I'd made a stupid mistake. This was so frustrating that I took to stop trading come 16:30, or at least I'd drastically drop my stakes. This latest trial calls that into question since the work is being done automatically and, for the moment, I'm using pretend money.

The phenomenon is weird because it seems to be restricted to dinner time. Specifically, between 16:45 and 19:00. Let me give you some figures to illustrate my point:

Using stakes of £50 the strategy has covered 74 races, losing 10 of them for a hit rate of 86.5% and a profit of £447. This is an excellent result and I'm delighted with it. However, when I looked through the results over the past three days I found that 80% of those losing trades happened during that dinner time period. Weird!

That time period produced a loss on each of those days. If I stripped those results out, the strategy had 2 losers out of 53 for a hit rate of 96.2% and a profit of £669. That's a huge difference! If not executing this strategy on these races produces such a huge upside, would it be possible to purposefully reverse this strategy for these races to boost things further? A very interesting question. 

Now whether this is truly a trend, only time will tell. It is certainly worth keeping an eye on and I'm very interested to see how it pans out as I continue my trial. If a long term trend does prove to be present I feel I'll still be at a loss to provide an explanation. One thing though; because of the automated nature of what I'm doing, I'll be happier that it isn't my fault. I'd still like to know why.

Early days for this strategy but it is important that you record your performance, irrespective of what strategy or trading technique you are employing. If you've never done it before, it can be a bit of a drag, but you'll soon get used to it. The potential benefits can easily outweigh the effort spent.

Back Or Lay Picks 10th-12th July

A couple of selections in Norway for this weekend...


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Lay Of The Day Trial

For those of you who do not follow along on Twitter (shame on you!) I'm two weeks into a trial where I post a horse selection for lay of the day. At the time of posting, there's been 14 selections consisting of 11 losers, 2 winners and 1 NR resulting in an ROI currently sitting at more than 20%.

Past results are available in a Microsoft Excel Online spreadsheet and selections are announced on Twitter (@betyourlifest) usually by lunchtime on the day in question. Check the results out by clicking the image.

Fairbot Automation Conditions Setting Tips

For those of you who use Fairbot and in particular the new automation facility, I thought I'd bring your attention to some important considerations when configuring a rule.

You must be clear about what you are trying to do and how the market behaves and the way your rule will react to changes in that market. Specific care must be taken when creating any 'conditions' that will trigger your rule to place a bet or trade in the market.

For example, imagine you wanted to back the first runner that fell below odds of 5.0 say. If you refer to the image below you'll see that there's one of three options you can select which, when met, will trigger the rule.

Care must be taken when selecting either 'Back Price', 'Lay Price' or 'Last Traded Price'. This is particularly true if your new rule is going to be used in-running.

Given the large gaps that can rapidly appear and disappear in a fraction of a second, it is quite possible that the money on an outsider may disappear from the market momentarily thus bringing the first available back price below the trigger point of 5.0. If you've selected 'Back Price' as part of the condition's details then the rule will fire and your bet/trade will be placed on the outsider. This may not be what you intended.

In this particular case, it would be better to use 'Last Traded Price' or, though perhaps counter intuitive, 'Lay Price'. I prefer the latter as it guarantees that the runner is definitely trading below your trigger point rather than it being triggered by a large gap in the prices or a momentary spike/dip.

So, make sure you understand your market and be aware which conditional setting would be the better option.

Care must be taken though when testing these in simulation mode. The software can never know if your stake would have been taken in the real market and therefore it has to make some assumptions. In addition, the refresh rate in simulation mode appears to be throttled. If you don't believe me, switch simulation mode off and on close to the off with the refresh rate set to 0.2 secs and compare the rate of change of the numbers in the grid. This will also have an impact on your rule when testing in simulation mode.

After testing, make sure you test in real mode with small stakes until you are satisfied that your rule is behaving as you expected it to.

Back Or Lay Picks 28th-29th June

It's been a while since the last selections, but pickings are always sparse during the summer months. I do have one this weekend for you in Norway though...


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Fairbot Support

I've had a number of requests recently asking for support on using some feature of Fairbot. One determined user managed to track down my mobile number and send me a text message.

I am happy to answer questions directly related to posts I've made or videos I've uploaded but please note I am NOT an employee of Binteko, the developers of Fairbot. Nor am I involved in providing paid support on their behalf. My only connection to them is that I'm a user and I have an affiliate link on the 'Trading Software' page.

If you require assistance using Fairbot I suggest you contact them directly via their contact page on their website. My experience of their support is a positive one and they are usually prompt with their responses.

In the meantime, I'll happily respond to feedback on my youtube channel or blog posts but I'm not here to provide free support for a product for which I have no financial involvement.

Back Or Lay Picks 30th-31st May

Off to the continent for all of today's selections...


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Racing Database Redesign - Part 14 - Manual Error Checking

I thought I'd take a moment to discuss the importance of periodic checking of your database to ensure that your automated systems are doing what you expect them to do. This is all the more critical when your database pulls in data from a third party to which you have no control over - as is the case for my horse racing database.

I've already mentioned a case in point where a couple of data files from Betfair had empty MENUHINT hint fields. This resulted in importation of data failing because the BFShortName for the venue could not be ascertained.

In that instance, I was able to programmatically check for that missing data and substitute a dummy value to enable the data importation process to continue. I now still have to periodically run the following SQL query to identify any events where the BFShortName is unknown. I can then go in and manually create it.

SELECT * FROM 'event' WHERE 'bfshortname' = 'Unknown';

This weekend, and purely by chance, I discovered another issue which wasn't trapped by the error checking in my script.

The TimeForm site does not seem to use the same values for eventid and selectionid for it's historical results pages as those that are contained within the data files downloaded from Betfair. For some reason, I had always assumed they did. I was in the process of confirming that when I discovered that the race data from the data file released on 22 May 2015, i.e. races held on the 21st, hadn't been placed into the EVENTDETAILS table. All the race data like MinIP odds had all been calulated correctly and entered into the EVENT table, but the data for each individual runner had not been entered into the EVENTDETAILS table.

This had not been trapped by the error checking in my script, yet clearly there was an issue. Take a look at the image below which shows the data file in question as loaded into a spreadsheet:

The highlighted rows illustrate the problem. That data shouldn't be in the file. I've yet to ascertain why the error checking in my script didn't catch it but now I know it is there I can search for it and remove it before the data is sent to the corresponding tables.

This is the issue with third party data. It can be changed without notice or can contain errors the nature of which are impossible to predict. Once identified, code can be put in place to trap those conditions but one can never be certain that some unforeseen condition isn't getting through.

So, like the case of the missing BFShortName above, I will have to periodically check that every race listed in the EVENT table actually has some corresponding runners listed in the EVENTDETAILS table by running the following query:

SELECT * FROM 'event' WHERE 'eventid' NOT IN (SELECT 'eventid' FROM 'eventdetail');

It is manual checking like this that needs to be done periodically to maintain confidence in the integrity not of the database design but of the data itself. The trouble is, if you don't know what to look for, how do you query your database to find it?

That's a difficult question to answer, but hopefully the intimate knowledge that you have with your database design and the data it is meant to contain will guide you in the process.

Back Or Lay Picks 23rd-25th May

Not many selections this weekend as expected. With most of these at very short odds it will not need many to come in to produce a decent return.


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Chat Room Requests

There's been a chat room on the Bet Your Life Sports Trading site since Sept 2007... at least until Feb 2014 when, due to lack of interest, I decided to pull the plug. Since the turn of the year I've had a number of enquiries about it and thought I'd pose the question:

If I brought the chat room facility back, would you actually use it?

I'm well aware that there are now many chat rooms available for traders, primarily provided by suppliers in order to support their products and services. That was never the intent of the original chat room, nor would it be if it was resurrected. I am not peddling any products or services so the chat room would be an independent facility where traders can hang out. Neither would I intend spending much of my time maintaining it.

If that sounds like something you might use then please leave a comment below indicating your interest. If there's enough numbers I'll look to setting one up again.

Back Or Lay Picks 19th-21st May

It has been something of a disappointing start to the last quarter of the year having gained only 0.28pts since the start of May. A case of keeping the faith until things pick up again... which they will. In the meantime, it is off to Sweden and Finland for the midweek selections this time around.


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Back Or Lay Picks 16th-17th May

I seem to be taking one step backward for every step forward at the moment. Better than going backwards all the time I suppose...


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Back Or Lay Picks 8th-11th May

A handful of picks for this weekend as the main Euro leagues draw to a close and I head North to Scandinavia...


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Racing Database Redesign - Part 13 - Automated Data Import

At the bottom of this post you'll find a video demonstrating the automated data download and importation process in action. I don't go into great detail in that video so it is worthwhile flagging up a few pointers so if you are trying to do something similar you'll know what to look out for.

Obviously some of the information may not be directly related to your situation if you are not using MySQL and PHP. Although I have no intention of posting lots of PHP script code hopefully you will be able to glean something from it in anycase.

Firstly, let me describe the process that the automation script follows:

1. Assuming no dates have been specified, the script heads off to  https://promo.betfair.com/betfairsp/prices and downloads the results files for the UK Win and Place markets based on the current system date, i.e. the files that contain the previous day's racing. Although there are files available for racing in other countries and the database can handle it, I just want to concentrate on UK racing.
2. It imports that data into temporary tables - one for the win data and the other for the place data.
3. It then analyses those tables on a race-by-race basis and, using internal MySQL commands, it splits that data and inserts it into the corresponding tables. It also calculates other information from that same data.
4. Once it has gone through all the races, it empties the aforementioned temporary tables, produces a summary of what it has done by writing everything to a log file then quits.

If the script was provided with a date range, steps 1-2 are repeated for each and every day within that range.

The script makes use of the PHP 'cURL' library to download each datafile, stick it into a variable then save the variable contents to a file on my server. It then uses the MySQL 'LOAD DATA INFILE' to quickly insert the contents of that file into temporary holding tables from where the data is then analysed and manipulated using internal MySQL commands.

While it is perfectly feasible to use PHP to manipulate and modify the data on a line be line basis before using the MySQL 'INSERT' command to insert the data into the corresponding tables, the method I'm using is significantly faster. As you'll see from the video below it takes 1-2 seconds for each day's worth of data. If you are not using MySQL, I strongly advise you investigate what bulk importation funtionality your software supports.

Finally, let me flag up a few things that you might want to consider:

1. Although the data files holding the results for the day are generally available by 10am the following morning, this is not always the case. Sometimes there is a delay so you might want to put off the download process until later that evening.
2. Very occasionally, Betfair have issues and the files aren't uploaded for a number of days so you may want to set up a mechanism that records that fact so that you can go back at some point to correct it either manually or automatically. I find a text based log file suits my fine, though you could have errors written to a database table instead.
3. Once you've downloaded the datafiles, there's no need to keep them after the data is imported into your database so you could set up an automated delete mechanism once the database is up-to-date.
4. If you configure the script to run automatically without any user input, you might want to consider having the log file emailed to you if you are away.

Now that everything is up and running, it is time to concentrate on the interface so I'll draw this series to a close. At some later stage I'll come back to introduce the interface but that is likely to be a while.

In the meantime, I hope you have found some of my ramblings of use and if you have any questions, just leave a comment below.

Back Or Lay Picks - 9 Month Summary Aug 2014 - Apr 2015

There's no midweek selections this time around so I'll take the opportunity to summarise this year's performance to the end of April 2015:

Bet TypeNo. Of BetsPts P&L
No Bets18NA

An overall ROI of 5.9% is not to be sneezed at though the back bets continue to drag things down as they have all season - disappointingly so. Still, very happy with an ROI of 12.6% for the lays.

Selections will start to dwindle now that the main European leagues are drawing to a close. Picks will continue though during the summer as I take in the main Scandinavian leagues of Finland, Norway and Sweden.

Back Or Lay Picks 1st-3rd May

After a complete washout midweek there's a fair few selections this weekend that will hopefully redeem the situation.


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Racing Database Redesign - Part 12 - Data Import Considerations

As I am using MySQL to host my database and will be automating it with PHP (or Python) scripts much of the work I have to do now will be somewhat specific and therefore only of interest to those employing the same technologies. With this in mind, I'll try and keep what future posts I make somewhat more general in nature. I certainly don't intend to post screeds of PHP scripts on here.

I'm assuming you've created your tables and are keen to put them into action but before you do, there are a number of things you need to be aware of:

1. The data contained within the downloadable files needs to be split up into the corresponding tables. However, that splitting must be done in a particular order. Specifically, any data that belongs in a table on the 'one' side of a one-to-many relationship must be entered first.

(i) To this end, the VENUE table must contain an entry corresponding to the 'BFShortName' before the event information can be entered into the EVENT table. Fortunately the VENUE table can be populated in advance by referring to the spreadsheet I introduced to you previously.
(ii) The event information must be entered into the EVENT table before all the individual horse data for each event can be entered into the EVENTDETAILS table.
(iii) The same applies to the HORSE table though the JOCKEY and TRAINER tables are OK assuming you've specified that the foreign keys 'JockeyID' and 'TrainerID' in the EVENTDETAILS table can be NULL.

2. The data from the Win market must be entered into the EVENTDETAILS table before you insert the corresponding Place market data.

3. Make sure you are familiar with the downloadable files and how they are structured. Specifically:

(i) The EVENT_DT field is formatted as a string like this - "DD-MM-YYYY HH:MM". If you are using MySQL it will expect the format to be "YYYY-MM-DD HH:MM". This may be the case for other database management software as well.
(ii) The MENU_HINT field contains the course name in short form that is required by the 'BFShortName' field in the database. It also contains the country code at the beginning and the date at the end. Both these items need to be stripped away to leave the short form course name.
(iii) You'll see figures for BSP, PPWAP etc are specified to a large number of significant places. There's obviously no need to go to that level and 2-3 decimal places are fine.
(iv) The first item on each row of the data file is the EVENT_ID. Note that the data contained is not necessarily in order nor are the runners necessarily grouped together by EVENT_ID.

Once your database is up and running, it is perfectly feasible to manually import the previous day's data by manually downloading the file, modify it as required above if necessary and import it using whatever means your database software supports. It's a rather time consuming process but it is possible.

However, with well over 4500 win and place market data files for the UK alone - at the time of writing - stretching back to 2008, doing this manually is clearly impractical.

Obviously the sensible thing to do is create some automation that will (at least initially) download, edit and import all the results from the previous day's racing. Once that is fully working it should be a trivial matter to expand it to loop through all the downloadable files and install the data in one fell swoop.

That's what I'll be looking at next time around and pointing to the things you need to look out for when doing so.

Back Or Lay Picks 28th-30th April

The season is slowly drawing to a close but note, these picks don't. I continue during the summer with the Scandinavian leagues though the number of picks will be somewhat less.


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Racing Database Redesign - Part 11 - Design And Admin Software

Just a quick note about design and admin software you can use on your project.

All the modelling I've talked about so far does not require any use of design software. Indeed, you don't need to go anywhere near your database application until you have to physically create the tables.

If you are using MS Access as your database, you can graphically create your relationship between tables and the software will prompt you accordingly for various settings. However, the tables have to be created beforehand - or they did in the last version of Access I used in anger. So you would have to do the modelling up front before creating the tables. You can of course invest in ERD design software but for small projects like this, it is unnecessary.

If you are using a MySQL server based database system like I am, particularly hosted on a Linux box, you have a number of options available to you. Note that MySQL is also available for Windows servers:

1. Webmin - A web based Linux server administration tool that lives on your server which comes with a MySQL administration module. It's primary role is for administering the server rather than creating databases but it does allow basic tasks such as creating/deleting tables, running SQL queries and the like. There is no ERD tool however.

2. phpMyAdmin - Another web based tool that lives on your server (Linux or Windows) but this is specifically designed for MySQL server admin and database creation. As such, it is much more sophisticated and offers complete control over your database design process, including an ERD module in which relationships can be graphically created and configured.

3. MySQL Workbench - A cross-platform client administration and database design tool giving you complete control over your MySQL server and database design process. The modelling feature is particularly useful in that it allows the design of the database to be done, including ERD, without the need to create the tables (the table fields and settings still have to be defined as part of the process). Once you are happy with the design the tables can be automatically created on the server with just a few mouse clicks. It's particularly useful for designing large databases over multiple servers.

While I like MySQL Workbench, I use phpMyAdmin as my tool of choice. Not only am I more familiar with it, but given the scale of databases I tent to create it suits my purposes perfectly. Also, because it is web based, hosted on my server, I can access it from any device without having to worry about installing software on each client.

The image above was created using phpMyAdmin's designer tool. You'll notice that there's a few additions to the EVENT table from last time. This is to allow me to analyse some of the in-running detail on a race-by-race basis. More of that nearer the time.

Back Or Lay Picks 25th-26th April

An unusually quiet weekend this time around...


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Racing Database Redesign - Part 10 - Tables (Update)

So much for taking some time out from blogging to create the database! As soon as I started I noticed an issue with the Betfair data that would upset things slightly.

The VENUE table is, for all intents and purposes, pretty static in that the data it contains isn't going to change very much if at all. As a result, this table can be easily pre-populated as part of the design process using the information available from TimeForm and the spreadsheet file I flagged up in my last post.

If you look back to the ERD, you'll see that the VenueID is the key field. This identifier is not directly available from the downloadable Betfair results files but I wanted to use that as the key (i) because I assumed it was unique and (ii) I could use it within any future interface to call up the TimeForm race card by programically creating the URL.

The image below illustrates why I intended to use it. You'll see the CourseID of 1.9 in the URL which identifies Catterick Bridge.

Each course will have a corresponding CourseID and it would be a trivial process to retrieve that from the VENUE table and build the corresponding URL.

However, all is not as straight forward as it seems. There are a number of issues I discovered when populating the aforementioned spreadsheet file with these identifiers in preparation for importing to the database:

1. Timeform treats Newmarket as two separate courses, namely Newmarket (Rowley) and Newmarket (July) with CourseIDs of 1.36 and 1.60 respectively yet the downloadable results files, the aforementioned spreadsheet and as far as I can make out, API software all treat them as one using the short form name of 'Newm'. This then makes it difficult to marry the results files to the correct venue in the case of Newmarket.

2. After the shortlived Great Leighs course arose phoenix-like from the ashes with the new name of Chelmsford City, Timeform simply reassigned the same CourseID to the name of Chelmsford City. A reasonable thing to do you might think except, if you change the dates on the Timeform site to 8th Jan 2009 you'll see that a meeting that was held at Great Leighs was apparently held at Chelmsford City - long before such an entity existed. You and I know that they are one and the same location, but software doesn't.

While this may not necessarily be an issue in and of itself, the historical results files for the time that Great Leighs was operational refer to it's orginal name not its reincarnation. If this issue isn't taken care of, the importation of all those races at Great Leighs will fail.

3. It is not an operational issue but you'll also find the venue long name in the spreadsheet file differs from the long name used by Timeform. For example, Epsom v Epsom Downs; Fontwell v Fontwell Park; Stratford v Stratford-on-Avon and so on. You may want to edit the spreadsheet accordingly before importing into your database.

So, what do I do about it? Fortunately there is a relatively simple solution that doesn't involve a major redesign.

If you look at the short name forms in the spreadsheet and the downloadable results file you'll see they are the same. More importantly, they are unique. Consequently, the BFShortName field in the VENUE table can be used as the table key and the VenueID field can be demoted to a normal field containing the code needed to programmatically build the URL to Timeform.

This means that the Great Leighs/Chelmsford City schizophrenia is taken care of by having two entries each with the same VenueID and it turns the Newmarket issue into a minor anomaly where creating its URL for Timeform will present a little, not insurmountable problem.

I won't present an updated ERD at this point just in case I come across other issues with the data that force a change on me. Instead, I'll hold back on that until I've created all the tables and I'm satisfied there aren't any more issues.

Racing Database Redesign - Part 9 - Tables (Final Design?)

The table design is drawing to a conclusion now, but there's one more important thing to do before reviewing the final design for any omissions or errors.

When setting up a database, it is not enough to establish the tables and the relationships between them. One must also consider the data that is being stored; what 'type' it is and what 'size' it is. This is important as it helps to ensure that the amount of storage space required is minimised and only the correct type of information is stored in the corresponding fields. To illustrate this point, let me take a simple example...

Imagine you create a table field that is intended to hold the names of Jockeys. As you don't know how big peoples' names may be you create that field so that it can hold 200 characters. As your table fills up over time you may find that the maximum length of any Jockey's name is 25 characters. This means that that one field in your database is wasting 175 characters worth of data storage for EVERY record in that table.

If you've done the same type of thing in all fields, over all tables throughout the database I'm sure you'll see that that wasted space can mount up very quickly.

Also, specifying the type of data each field will hold helps to eliminate data input errors. So, if a field should only contain numbers any attempt to enter text will result in an error.

For a full list of data types available reference should be made to the specific database software you are using. In any event, you can use the ERD to document what data types and settings are required for each and every field within your database.

I use the following format:

datatype(size if applicable), NULL [yes|no], DefaultValue

So, using the EventID field as an example, its datatype is set to 'int' (that's an integer) with a size of 20 characters; NULL is set to 'no' - in otherwords it cannot be empty; and finally there is no default value set - in otherwords its value must be specified manually or programmatically.

The final ERD below lists all the settings required, though some will need further explanation.

Generally speaking, it is best to make sure each field cannot be NULL, i.e. empty. This is not possible with this database simply due to the nature of the data that is being recorded and where it is coming from. So you'll see all of the tables except EVENTDETAIL have at least one field specified to allow a NULL value. Unfortunate, but it cannot really be helped in this case - at least not until I can find sources for that historical information.

Also note the use of the 'varchar' data type. I'm sure MS Access has a similar data type but I cannot remember what it is called. When using MySQL as your database you will find that indexing fields that contain text performs much quicker if the field is configured as a 'varchar' rather than 'text'. I'm not privy to the technical reasons this is so though I'm sure there will be some explanation on the Internet for those who are interested. Suffice to say, if there's any chance you may wish to perform any form of querying on a text field, configure it as a 'varchar'.

I should flag up one important point that you need to consider with regard to the data types that you specify:

Virtually all of this data is being provided by a third party, namely Betfair. As such, neither you or I have control over that data. Currently, Betfair provide internal IDs such as EventID as a numerical field. Though unlikely, there is nothing to stop them changing that to an alphanumeric field for example. In that event, any attempt to store such EventIDs into the current design would fail.

I don't think Betfair would implement such changes, but I feel it is important that you are aware of such a possibility. In other words, be aware of where your data is coming from, how much control you have over it and the potential impact it may have on your design.

Table and Design Review:

At this point it is worth doing a final review of the design to ascertain if everything is covered or indeed if anything has been missed not only from the design but the original specification. In no particular order...

1. You should note from the ERD I've changed my mind about having separate date and time fields in the EVENT table. I'm not sure why I suggested it in the first place as, when I checked my old database, I had indeed stored them as one in 'datetime' format.

2. Note that I've included two fields in the VENUE table that ostensibly hold the same information, namely Name and BFShortName. The Name field will hold the full name of the venue whereas BFShortName will hold the abbreviated venue name as contained within the downloadable BF results files as well as the menus you might have seen in trading software. A full list of these pairings can be downloaded from Betfair as an MS Excel file. Using this file, the VENUE table can be pre-populated though you still need to find the corresponding VenueID values.

3. Although the place market downloadable result files indicate whether a runner was placed they do not indicate its finishing position. Consequently there is no way to ascertain a horse's past form figures from those files. The form figures are available from the Betfair API or numerous websites and the best place to place that information would be in a text field stored in the HORSE table. This field would need updating every time the horse ran.

4. A general rule of thumb within database design is this: if something can be calculated from existing data then there is no need to store that result within the database. Depending on the complexity of the calculation required, this can increase processing time for every executed query. If storing that info upfront reduces the overall processing time then it may be worth ignoring this general rule.

Take for example the well known Lay The Field strategy. As an exercise, consider how you would go about analysing the historical record using the ERD design above in order to ascertain which venues, race distances, field sizes, race types, etc were statistically good for implementing the LTF strategy - answers/suggestions in the comment section please.

5. It is worthwhile reviewing other sources of information or websites to assess whether other suitable information exists that might be usefully included. For example, at the bottom of every card on the Timeform site is listed the 'Betfair betting forecast' - a list of projected BSP for each runner. Could that be used to develop any future strategy? If you do not record it you'll never be able to analyse the data so you'll never know.

6. Note that the design is generic. While I am principally focussing on UK horse racing, you'll see that the downloadable datafiles all have the same format. Consequently, races from Australia, Ireland, South Africa and USA could be included. As an added bonus, UK greyhound races (granted jockey info would be irrelevant here) could also be recorded within the same database - or more suitably a database with similar structure. Don't say I don't give you value for money!

As you can see from the above points, not only can a final review flag up any omissions but it can lead to expansion beyond the remit of the original specification. The normalisation process which I discussed in an earlier post helps to facilitate this but the database designer must be able to draw the line somewhere and for me, this is it. I'll make a few tweaks as outlined in the review above but the design is now essentially finished.

I hope you've found this series of some use and feel able to have a go yourself. I'm going to take a break for a couple of weeks as I start to implement the design. I'll return with a few posts to discuss some of the automation side of things. In the meantime, let me wish you good luck with your endeavours.

Back Or Lay Picks 21st-22nd April

Back to Milton Keynes for the single selection from this week's midweek games. Here's hoping they do a little worse than their 6-1 drubbing of Leyton Orient at the weekend.


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Back Or Lay Picks 18th-19th April

Going through a little flat spot at the moment so here's hoping this lot will pick things up again. Currently 20.3 points in profit since the beginning of August.


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Racing Database Redesign - Part 8 - Tables (continued)

The main tables are essentially finished in terms of providing a model for a horse racing database. However it is not quite there yet as far as recording the results data from Betfair - the main reason for doing the thing in the first place.

The data files located on Betfair are provided as comma delimited files (csv files) which can be imported into a spreadsheet or, in my case, a database. Before that can be achieved it is necessary to take a look at their structure. The above link will take you to a page listing all the downloadable files going back to late 2008. It also has a table describing the file contents.

For your conveinience and with thanks to Betfair, I've reproduced the table below. I've also taken the liberty of including an extra column containing the data of one record from one of those files to which I will refer during the following discussion.

EVENT_IDBetfair internal event ID118128024
MENU_HINTContains the name of the race meetingGB / Aint 9th Apr
EVENT_NAMEThe race name as it appears on the Betfair menu - place markets are always 'To Be Placed'2m4f Grd1 Nov Chs
EVENT_DTThe start time of the race09-04-2015 13:40
SELECTION_IDBetfair internal selection ID5310463
SELECTION_NAMEThe name of the runnerCash And Go
WIN_LOSE1 if the runner won (or placed), 0 if it lost (or was unplaced)0
BSPBetfair Starting Price22
PPWAPWeighted Average Price of all bets placed pre-off20.4506469383
MORNINGWAPWeighted Average Price of all bets placed before 11am GMT19.8992724833
PPMAXMaximum price placed before the off (only bets of a payout of more than GBP100 included)23
PPMINMinimum price placed before the off (only bets of a payout of more than GBP100 included)17.5
IPMAXMaximum price placed in-play (only bets of a payout of more than GBP100 included)1000
IPMINMinimum price placed in-play (only bets of a payout of more than GBP100 included)19
MORNINGTRADEDVOLAmount traded before 11am GMT8902.98
PPTRADEDVOLAmount traded before the off42758.9
IPTRADEDVOLAmount traded in-play3419.04

The first four fields obviously relate to what will be stored in the EVENT and VENUE tables within the database. However, with the exception of the EVENT_ID field none of these are in the form required by the database:

MENU_HINT - Needs the country and date information removed to leave just the venue name.
EVENT_NAME - Contains both the race distance and race type. These need to be separated and placed within the corresponding fields of the EVENT table.
EVENT_DT - Contains both the date and time of the race which need to be separated and placed within the corresponding fields of the EVENT table. (Note that it is perfectly feasible to keep this data in the one field because databases essentially treat date and time data in the same way. This can involve extra work on the programming side when querying data so I prefer to split them up front. This is a matter of personal choice).

SELECTION_ID and SELECTION_NAME belong in the HORSE table but what about the rest? At first glance there doesn't seem to be a table for this information.

This is were we make use of the joining table EVENTDETAIL. Generally speaking, joining tables only consist of foreign keys, but as I alluded to in earlier posts, there is nothing preventing them from also acting as normal tables in their own right. Consequently, as these other fields are directly related to each runners Betfair performance in each race, the EVENTDETAIL table is the sensible place to put them.

As an aside, all the fields relating to traded volume and weighted average price can be ommitted as it wasn't part of the original specification. I cannot at the moment visualise how it can be used. However, I'm going to include those fields to give me the opportunity to carry out some analysis to see if there are any angles for new strategies.

That covers the information for each win market, but I also want to record some of the place market results. I'm not interested in the volume traded figures for the place markets, nor do I have any need for the max and min prices. Consequently, I'll be limiting place market information to the WIN_LOSE and BSP fields.

Note that the data files provide no direct link between the EVENT_ID for a win market and the corresponding EVENT_ID in the place market. This would need to be ascertained programatically by comparing the MENU_HINT and EVENT_DT information keeping in mind that occasionally there may be no corresponding place market available.

So, with all that in place, I can now present you with the ERD containing all the table fields to date:

There's still a little work to do with regard to the table fields. Specifically, the data types and sizes still have to be specified. In addition, it is worthwhile reviewing the design and identifying anything that may be missing. I'll do all that in the next installment.

Back Or Lay Picks 14th-16th Apr

The weekend proved to be something of a non-event with a tiny 0.07pt loss so it's now the turn of the mid-week games to pick things up a bit...


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Back Or Lay Picks 10th-12th Apr

Only a handful for this weekend but hopefully these will do enough to recover the small loss from midweek matches.


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Racing Database Redesign - Part 7 - Tables (continued)

Having identified issues with the initial ERD model it's now time to present you with my preferred option. As you'll see from the ERD below, this new arrangement reduces the number of tables required to six including the need for only one joining table. How was this possible?

If one considers the previous model, one would note that the JOCKEY, HORSE and TRAINER tables were all involved in many-to-many relationships with the EVENT table through corresponding joining tables. The data contained within those tables are inextricably linked if one views that data from the point of view of each individual event. Consequently, it is not unreasonable to consolidate those joining tables into one joining table which I've chosen to call EVENTDETAIL.

The interesting benefit of this arrangement is the elimination of the need to have a joining table between JOCKEY and HORSE tables (and the historical partnership issue that came with it). That partnership is now automatically maintained within the EVENTDETAIL table - as is the trainer/horse pairings as well. They are now both directly tied to the individual event concerned. Consequently, not only is the number of tables within the database reduced but the complexity of some of the querying that would need to be performed to establish those partnerships is also eliminated.

A number of points about the ERD before discussing it in more detail:

1. In line with what I was discussing last time, I've renamed the COURSE table as VENUE. Not only did it seem sensible but there's another reason as you'll see in a later post. You can of course call your tables whatever you like but it makes sense to call them what they represent.
2. I've started to finalise the design by incorporating the non-key field names for each table.
3. In the EVENTDETAIL table, I've chosen to follow strick procedure and show a joint-key between all four foreign key fields, i.e. EventID, HorseID, JockeyID and TrainerID. In fact only the first two are required to uniquely identify each record within that table, as discussed below.

While the above table arrangement represents what I want to model, there's always going to be an issue with historical data, specifically for trainers and jockeys. As I said earlier in this process, unless I can find a source from where I can pull that information in en masse then I'm not that concerned with modelling that aspect no matter how nice it would be to have it.

As I ultimately want to be able to use this database to guide my selection processes on the day of a race, that day's information can be retrieved programatically either through the API or web scraping techniques. It is purely the historical information that may cause issues. However, the absence of such historical data may cause a problem where the JockeyID and TrainerID fields within the EVENTDETAIL table can be empty.

Any key field within a table cannot be empty. However, those fields in their role within the EVENTDETAIL table are NOT key fields in and of themselves. As a result, depending on the database software you are using, you may find that individual entries within that field can indeed be empty. Keep in mind that having empty fields within a database still takes up storage space which is one reason why they should be avoided.

If you find that your database software complains if those fields are empty, you have a couple of options - well three if you decide to scrap the idea of recording trainer and jockey info.

1. Betfair assigns unique numerical codes to these items. These are usually six or more digits. Within each of the TRAINER and JOCKEY tables you could assign a dummy entry with the ID number of -1 say and the name of 'NA' and use that to represent all of the historical records that you are unable to get. As you build up those parent tables, the dummy entry will be used less and less.
2. Since the EventID and HorseID are enough of a joint key to uniquely define each record within the EVENTDETAIL table, one could simply remove the other two as being members of that joint key. Note that they still need to be contained within the table. This would allow the TrainerID and JockeyID fields to be empty.

I actually prefer option 1 or a combination of the two which would eliminate the possibility of those fields being empty in any event. As I said, check your database software as it may force you into one option or the other.

Next time, I'll look at finalising the table design by looking at what needs to be done to hold the Betfair race data.

Racing Database Redesign - Part 6 - Tables (continued)

I promised last time I'd introduce you to a second option; indeed, my preferred model. I'm going to renege on that promise for now as there are a couple of things I'd like to discuss in relation to the ERD I presented last time.

You may have seen the comment to the last post made by Dave regarding making a database '3nf'.

Dave was referring to the process of database 'table normalisation', specifically '3rd Normal Form'. As ever, you can find further information on my business website or feel free to google 'database normalisation'.

Suffice to say any database you design will be well served by getting your model into 3rd normal form. It ensures the integrity of the data contained within; eliminates repeating fields within tables; guarantees that each piece of data is stored in only one place; confirms tables only include data that belongs to the entity the table is describing and thereby makes accurate querying and data manipulation much, much easier.

Using Entity Relationship Diagrams as a technique not only forces you to think about the tables and relationships but you'll find that the design almost automatically falls into 3rd normal form. It certainly makes it easier to spot when it isn't.

Going back to the ERD presented last time, and not withstanding the issue with past jockey/horse partnerships, you'll find that the database is in third normal form. This leads me on to another important point about the normalisation process... It makes the expansion of your database beyond the remit of your original specification much easier without the need for a major redesign.

By way of example, at some later date I may decide I wanted to record further information about each track at each course. Off the top of my head: number of jumps; space between jumps; length of run in from last jump; is the course undulating or flat; is the run in up hill and so on. Or perhaps I might want to record information about a horse's breeding or detailed information about a trainer's yard.

To incorporate that data it is simply a matter of adding the additional tables required and creating the necessary relationships between them and the existing tables. The current data remains intact and no redesign is required to the existing structure.

Finally, let me revisit the jockey/horse partnership issue from the current model.

As indicated last time, the best way (I say best way in the context of the current ERD and the fact I can think of no other way within those confines) to handle it would be to treat it in the same way as the trainer/horse partnership and incorporate some date information within the joining table, HORSEJOCKEY. Those dates would reflect the date each partnership took place, i.e. the date of the race in question and therein lies the rub.

That date information is already contained within the EVENT table and incorporating it elsewhere would, while allowing the functionality I'm looking for, create a cardinal sin by storing the same data in multiple places within the database. This can lead to additional work in both programming the database as well as its operation but more importantly, it can lead to errors creeping in when recording and querying data. Every effort must be made to avoid that situation arising by eliminating the possibilty at the design stage. In otherwords, you should always ensure that data is recorded in one place and one place only.

That now leads me to the end of this little diversion and back on track. Next time I'll take a look at the revised ERD I've been promising you.

Back Or Lay Picks 6th-9th Apr

Sorry for this being a little late but you've still got time to follow the ones playing today...


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Stephen Lohoar - Showjumper

Currently competing at the Scottish National Equestrian Centre, where my daughter is working.

This ignorant fuck has blocked my daughter in with his horse lorry for over 6 hours. Despite being asked on numerous occasions to move it, he refuses to do so. He simply jumps on another horse and rides off again.

Unfortunately I cannot find a twitter account for him otherwise I would've contacted him directly.

Some people just cannot help being utter arseholes.

Racing Database Redesign - Part 5 - Tables (continued)

Last time I left you to mull over what might be wrong with the ERD I had left you with. In particular, I queried the inability to marry a jockey/horse partnership to the EVENT table.

Hopefully, you'll have spotted the big clue I left. The ERD as presented is incomplete in that there should be another relationship in there.

Just as a list of horses running in an event can be ascertained through the relationship between the HORSE and EVENT tables, a similar many-to-many relationship exists between the JOCKEY and EVENT tables. That is to say, any one jockey can ride in a number of events and any one event clearly has any number of jockeys taking part.

Once that is established, a query can be run on the HORSE - EVENT relationship to establish a list of horses; a query can be run on the JOCKEY - EVENT relationship to establish a list of jockeys; and finally, the jockey/horse partnerships can be established by comparing those two lists through a query on the JOCKEY - HORSE relationship.

Exactly the same situation exists with regard to the TRAINER table. A many-to-many relationship between the TRAINER and EVENT tables must be catered for.

Those changes are illustrated in the revised ERD below in which you'll notice two things:

1. The number of tables required has doubled due to the number of many-to-many relationships that need to be handled.
2. I've started to label each table with field names. For now, I'm limiting myself to listing 'Key Fields' and those involved in the creation of the relationships.

Again, I'd point you to the tutorials on my business website for more details but it's worthwhile giving a brief explanation at this point regarding 'Key Fields'.

Put simply, each record within a database table must be unique. It is the job of the table key to provide that functionality. If one looked down the key field (or column) in a table, no entry will ever be repeated.

In an ERD, I show the keys in bold type. Relationships are generally established between table fields containing the same data. To enable that to happen, the key field from the ONE side of the relationship is exported to the table on the MANY side of the relationship as a 'Foreign Key'. This name is perhaps a little misleading in that 'Foreign Key' is NOT a key field within the new table. You'll see the CourseID field in the EVENT table is not shown in bold but its presence is necessary to establish the relationship between the COURSE and EVENT tables.

Now, consider the joining tables used to model the various many-to-many relationships by taking the JOCKEYEVENT table as an example:

These have two fields both of which are embolden because AS A PAIR they make up a unique table key. Individually however, they are not unique. Each EventID will be repeated a certain number of times equal to the number of runners in that event. Each JockeyID will repeat equivalent to the number of races he/she has participated in to date. But taken together, there will be no repeating combinations.

Moving back to the model as it now stands, there is still a problem with it...

Imagine for a particular race we wanted to list the horses taking part together with trainers concerned. If a particular horse was competing in a race where its previous trainer also had a runner then any query done on the current model would list both those partnerships.

To get around that, a record of when a horse was transferred to a new trainer would have to be included in the database. The easiest thing to do would be to put that date information into the HORSETRAINER joining table. Any subsequent query would simply ask for the most recent combination in order to eliminate the old partnership from the list.

Whether that data is readily available may be an issue for the purposes of this database.

The same problem rears its ugly head elsewhere. When trying to list the jockey/horse combinations for a particular race it is quite feasible that a horse may be up against a previous jockey riding another horse. This one isn't quite so easily catered for as the horse/trainer example above.

In my last post I indicated that there were a couple of solutions to the initial model proposed. (There may well be others that I haven't considered). Next time, I'll introduce my preferred method that will both reduce the number of tables required and, once the database is operational, make querying the data easier.

In the meantime, if you have any questions or if you are finding my ramblings even a little bit useful then please leave a comment below.

Back Or Lay Picks 4th-5th April

A busy Easter weekend coming up but only four selections for consideration...


These picks should be backed if decimal odds at KO are 2.00 or over.
They should be laid if their odds are at 1.90 or under.

Wading Through Treacle

As regular readers will know, all my horse race trading tends to be in-running but this new Betfair API continues to make that more and more difficult.

I've done three races today so far and it's only 14:37 as I type this. Every race has been incredibly sticky with Fairbot not refreshing for up to 10 secs. It is virtually impossible to trade when the API is behaving like this.

The new API has been with us for nearly six months and these issues have been reported to Betfair by loads of traders yet they seem incapable or unwilling to resolve them. It really isn't acceptable. I'm only a small player at this. I hate to think how the professionals are feeling.

Here's a short video I just recorded showing the 14:40 at Towcester on 1st April. April Fool's day indeed.

Racing Database Redesign - Part 4 - Tables (continued)

Let me start by saying if anyone has any questions or would like to exchange ideas with me or anyone else that is reading these tutorials then feel free to post a comment.

In part three of this series, I started to introduce the tables and discuss the relationship they have with each other. Hopefully you've taken the opportunity to look at the tutorials on my business website which will give you a basic insight into database table design.

The initial ERD I introduced last time is deliberately flawed so that I could illustrate the important link between the objectives and specification of the project; the data modelling presented within the table design and the internal relationships with the data being modelled.

If you recall, the 1st draft ERD indicates a number of many-to-many relationships within the design. This type of relationship is not unusual in a database but they need to be modelled differently because Relational Database Management Systems (RDMS) cannot handle them efficiently. The solution is quite simple however. Each many-to-many relationship can be modelled by splitting the relationship into two one-to-many relationships between the existing tables and a new intermediary table known as a 'joining' table.

The ERD diagram below represents the new arrangement where each joining table contains the key fields from corresponding tables that 'feed' it.

The presence of many-to-many relationships within the model as it stands is not an issue. I'm more interested in the relationships between jockeys, trainers and horses. First, ignore what you see in the above diagram and simply consider the relationship between jockeys and horses.

It is reasonable to assume a relationship between the two exists in that any one jockey can ride any number of horses in a career lasting many years and any one horse could be, over its career, ridden by a number of different jockeys. This is the very definition of a many-to-many relationship. Yet does this relationship fit in with the objectives of the project? How does one go about listing the jockey and horse partnership in any one event?

Clearly, for any given event, the list of horses can easily be obtained thanks to the relationship that exists between the EVENT table and the HORSE table. One might then think that the relationship between the HORSE table and the JOCKEY table gives us access to the jockey information. It does, in that it lists all those jockeys who have every ridden those horses. Unfortunately, because there is no direct link between the EVENT and JOCKEY tables it is impossible to assertain which jockey was riding which horse in any one particular event.

If you recall from the last post, the EVENT table consists of information about each race, namely Betfair's unique ID number (EventID) as well as date and time information. We need a way to relate the jockey table to that event information before jockey/horse partnerships for any particular event can be identified.

Now a very similar argument exits for the relationship between the HORSE and TRAINER and its resolution will use the same technique as that used between HORSE and JOCKEY.

Such is the way of things in the world of database modelling that there is not necessarily a single solution. In this case there are a couple of solutions that spring to mind. With that said, I'm going to finish this post here and give you the opportunity to mull over what those solutions might be and I'll address them next time.

Racing Database Redesign - Part 3 - Tables

Before I get to the details of the database tables I should point out that I won't be discussing the technical aspects of database table design here. However, I will be using terms like Entity Relationship Diagram (ERD), one-to-many and many-to-many relationships, attributes and data types. If you wish to learn more about these things (and I humbly suggest you may find it useful not just here but for any other databases you may be designing) then let me point you to a series of tutorials I wrote on my business website.

Now that that is out of the way, what about the horse racing database?

In actual fact, the database is a very simple one and needs only a handful of tables to hold the information I want. Even then, it may not need everything described below. If you recall from the post where I outlined the specification there are some things I'd like to include but only if I can find an online resource that allows me to pull that information in en masse. Otherwise, I do not wish to waste my time with it. For now, I'll include it in the model and I'll flag up those areas that fall into this category.

Below is a list of table names and their corresponding attributes that describe each item held within the corresponding table. At the moment, there is no attempt to fit it into a Betfair paradigm; I'm simply establishing the tables, the data they hold and how they relate to each other.

1. HORSE: HorseID, Name, DoB, Colour, Sex, RunningStyle
2. COURSE: CourseID, Name, Location, LayoutURL
3. EVENT: EventID, Date, Time, RaceDistance, RaceType, Going
4. JOCKEY: JockeyID, Name, DoB, Sex
5. TRAINER: TrainerID, Name, Location

Those are the main tables. There will be one or two others which I'll introduce and explain at a later date. They'll be used as temporary tables on a day-by-day basis and are not necessary to define the database itself.

Of those five tables, the last two are the most problematic. I've yet to find an online resource that will allow me to build the contents of these tables quickly and automatically as well as relate them to the corresponding Betfair ID numbers. It may be that the API is my best bet.

The attributes from those tables and some of the others are available from the timeform website, so that could be scraped each day. The feasibility of doing that for historical data going back to 2009 is open to question. I've still to research the best way to obtain the historical data that isn't included in the results files downloadable from Betfair - assuming that it is possible. The data obviously exists, but whether Joe Public can get at it is another matter.

The first draft Entity Relationship Diagram below shows the relationship between all five tables. The first thing to note is the three many-to-many relationships. These cannot be handled easily within a relational database model.

If you take the time to follow the database tutorials on my business website, and then refer back here, see if you can work out why I've created the relationships in this way.

This won't be the final solution - far from it. Over and above the many-to-many relationship issues outlined above there are additional problems with this initial model which I'll discuss in subsequent posts.