Financial Python

Studies in Finance and Python

Posts Tagged ‘code

Use python and sqlite3 to build a database

with one comment

The previous post showed how matplotlib can pull market data from Yahoo. Using some python-fu, you can easily create CSV files for given stocks. CSV files are great as they are easy to parse and don’t require a lot of overhead (in terms of setting things up, you can just open the file directly). Of course, Yahoo, Google, and others offer decent tools to view stock prices, so creating CSV files might be a bit redundant. Furthermore, if you really want to go whole hog and pull prices for thousands of assets, I imagine one would have to think carefully about an appropriate filesystem.

On the other hand, python includes a built in wrapper for sqlite, the ubiquitous file-based database that’s embedded in so many mobile and desktop applications. Sqlite can be run on any platform and is a completely open source, flyweight piece of software (which explains it’s popularity). This post will outline how to create a simple two table sqlite database to store stock prices.

First if all, sqlite has it’s own interpreter you can access from the command line, so if you’re an sql cowboy, you can instantly execute sql statements from the prompt to explore a given database. Here are a few examples:

The .help command lists some of the sqlite specific commands available at the command line.
sqlite_help
The .schema command prints the tables in the database (just as you might think).
sqlite_schema
You can also input sql statements directly.
sqlite_select
Anyway, enough about sqlite itself, let’s get to the python that built the database above. The attached script used four functions to package data for a defined set of tickers, using quotes_historical_yahoo(), and inserts the data into an sqlite database. The key is using a dictionary of tuples to pass around data.

stocks = {'data':[('GOOG', 'stock'), ('AAPL','stock')],
 'headers':('ticker', 'tag')}

In the example above, the ‘data’ key references a list of tuples that provide information on the ticker and type of asset in the tuple. The ‘header’ key references the table field names (as defined in the database schema) associated with the ‘data’ key. The same structure is used to write market data to the database, where the ‘data’ key refers to numeric data (e.g., foreign keys, date string, open, close, etc.) and the ‘header’ key refers to the formal fields to which each piece of data is associated.

raw_quotes = quotes_historical_yahoo(ticker, start, end) #list of tuples
data = []
for quote in raw_quotes:
 date_raw = datetime.datetime.fromordinal(int(quote[0]))
 year, month, day = date_raw.year, date_raw.month, date_raw.day
 date_string = str(year)+'-'+str(month)+'-'+str(day)
 record = (f_key, quote[0], date_string, year, month, day,
 quote[1], quote[2], quote[3], quote[4], quote[5])
 data.append(record)    

headers = ('asset_id',
 'gregorian_day',
 'date_string',
 'year',
 'month',
 'day',
 'open',
 'close',
 'high',
 'low',
 'volume')
return {'data':data, 'headers':headers}

I use a fair amount of string substitution in the code, which isn’t strictly recommended, but I’m not particularly worried about security. Anyway, the write2sql() function needs to know how many fields are associated with the data. Rather than create two separate functions for adding stocks to the asset table and market data to the prices table, I used string operations to expand or contract the substitution marks to match the number of headers. I’m sure there are more elegant ways to do it, but, hey, it works.

Finally, I’ll plug the Sqlite Manager add-on for Firefox. It’s a basic and easy way to inspect sqlite files.

SQLiteManager

Here’s the complete code: Read the rest of this entry »

Written by DK

August 5, 2009 at 3:36 am

Posted in Finance, Python

Tagged with , , , ,

Python Gmail Script

leave a comment »

Click here to download:

emailscriptexample.py (1 KB)

Sorry, forgot to attach the code for the previous post. Here it is.

Mirror at NotesToSelf

Written by DK

June 13, 2009 at 8:42 pm

Posted in Python

Tagged with ,

Update to Google Apps Script

with one comment

The Google guys finally updated the python client to reflect the addition of Groups to Google Apps. Groups allow admins to set permissions on distribution lists (for example, only list owners can send email to the list, etc.) Anyway, they basically added another “groups” service that allows the user to access Group objects. The group feed objects, however, are translated into regular python dictionaries for you and are not XML feeds. I’ve attached an updated python script (see this post for the original) that reflects the change.

Read the rest of this entry »

Written by DK

May 9, 2009 at 10:08 pm

Posted in Python

Tagged with , ,

Using Google Apps Python Provisioning API

leave a comment »

I’ve gotten my church staff to use Google Apps for email and collaboration (sort of). Unfortunately, the Google Apps admin web tool is a bit inflexible (though it’s improving). In an effort to get more control over the data, I put together a little script that pulls all the users, distribution list names, and distribution list constituents into a google spreadsheet. The program uses two nice python libraries, xlrd and xlwt, to read and write to intermediary excel files since it’s not possible to directly create google spreadsheets (the workaround I used is to upload an xls file).

The documentation for xlrd is quite good. The documentation for Google’s Python Provisioning API is ok, but not particularly well organized. For example, I was trying to figure out the attributes for the different XML feed objects and was told no documentation existed (by a Google employee, no less). After hours of trial and error, I found the PrintFeed function below in a separate piece of Google documentation that made it clear it was possible to use a generic attribute. I accept the possibility I was being a idiot. Anyway, the documentation for xlwt is not good, but the code below should give you a decent start and a few fruitful search terms.

The script basically makes a spreadsheet with a summary tab that lists the name of each distribution list and the number of members in each list. It also creates a tab for each distribution list containing the email addresses of the list. Pretty basic, but handy if people (that aren’t admins) need to know what email lists are available (and their constituents).

I’m sure there are better ways to code this, but what the heck, it worked for me. But I make no guarantees! (UPDATE, see post with updated code.)

Written by DK

March 11, 2009 at 10:39 pm

Posted in Python

Tagged with ,