When using Python's sqlite3 module, if I create a table and the first row has, say, 4 columns, does the next row also have to have 4 columns, or can it have more or fewer?
I’m looking to create a database of vocabulary words. Each word may have a varying number of definitions.
For example, ‘set’ would have many more definitions than ‘panacea’.
I would combine this vocabulary database with a scraper that looks up a word and its definitions on a dictionary reference site.
#! /usr/bin/env python
import mechanize
from BeautifulSoup import BeautifulSoup
import sys
import sqlite3

def dictionary(word):
    br = mechanize.Browser()
    response = br.open('http://www.dictionary.reference.com')
    br.select_form(nr=0)
    br.form['q'] = word
    br.submit()
    definition = BeautifulSoup(br.response().read())
    trans = definition.findAll('td',{'class':'td3n2'})
    fin = [i.text for i in trans]
    query = {}
    for i in fin:
        query[fin.index(i)] = i
    ## The code above is given a word to look up and creates a 'dict' of its definitions from the site.
    connection = sqlite3.connect('vocab.db')
    with connection:
        spot = connection.cursor()
        ## This is where my uncertainty is. I'm not sure if I should iterate over the dict values and 'INSERT' for each definition or if there is a way to put them in all at once?
        spot.execute("CREATE TABLE Words(Name TEXT, Definition TEXT)")
        spot.execute("INSERT INTO Words VALUES(word, Definition (for each number of definitions))")
    return query

print dictionary(sys.argv[1])
This isn’t an assignment, but more of a personal exercise for learning sqlite3.
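Your design goes against the spirit of relational databases, of which SQLite is one (Wikipedia defines a relation as “a set of tuples that have the same attributes”). Every row in a table has the same columns, so a row cannot have more or fewer columns than the table defines.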
The appropriate design here is a table for words and a table for definitions, linked by a foreign key. If your word has no other attributes besides its content, you can get by with skipping the words table and storing the word itself as the key column in the definitions table.
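As a rough sketch of the two-table design, something like the following could work. The table and column names (Words, Definitions, WordId) and the helper functions are my own choices for illustration, and the definition strings are placeholders rather than real dictionary text.

```python
import sqlite3

def create_schema(connection):
    """Create a Words table and a Definitions table linked by a foreign key."""
    # SQLite only enforces foreign keys when this is switched on per connection.
    connection.execute("PRAGMA foreign_keys = ON")
    connection.execute(
        "CREATE TABLE IF NOT EXISTS Words("
        "Id INTEGER PRIMARY KEY, Name TEXT UNIQUE NOT NULL)"
    )
    connection.execute(
        "CREATE TABLE IF NOT EXISTS Definitions("
        "Id INTEGER PRIMARY KEY, "
        "WordId INTEGER NOT NULL REFERENCES Words(Id), "
        "Definition TEXT NOT NULL)"
    )

def add_word(connection, name, definitions):
    """Insert the word once, then one Definitions row per definition."""
    with connection:  # commits on success, rolls back on an exception
        connection.execute(
            "INSERT OR IGNORE INTO Words(Name) VALUES (?)", (name,)
        )
        word_id = connection.execute(
            "SELECT Id FROM Words WHERE Name = ?", (name,)
        ).fetchone()[0]
        connection.executemany(
            "INSERT INTO Definitions(WordId, Definition) VALUES (?, ?)",
            [(word_id, text) for text in definitions],
        )

def lookup(connection, name):
    """Return all stored definitions for a word, in insertion order."""
    rows = connection.execute(
        "SELECT d.Definition FROM Definitions d "
        "JOIN Words w ON w.Id = d.WordId "
        "WHERE w.Name = ? ORDER BY d.Id",
        (name,),
    ).fetchall()
    return [row[0] for row in rows]

# In-memory database so the sketch leaves no file behind.
connection = sqlite3.connect(":memory:")
create_schema(connection)
add_word(connection, "set", ["placeholder definition A", "placeholder definition B"])
add_word(connection, "panacea", ["placeholder definition C"])
```

Calling lookup(connection, "set") then gives back a list with however many definitions were stored for that word, so the varying count lives in the number of rows rather than the number of columns.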
Note, however, that you’ll have one row per definition, not one per word.
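To make the one-row-per-definition point concrete, here is a minimal sketch that keeps the two-column Words(Name, Definition) table from the question and inserts every definition in one call with executemany. The save_definitions helper and the placeholder strings are assumptions for illustration; in the question's script the list would come from the values of the scraped dict.

```python
import sqlite3

def save_definitions(connection, word, definitions):
    """Store one (Name, Definition) row for each definition of `word`."""
    with connection:  # commits on success, rolls back on an exception
        connection.execute(
            "CREATE TABLE IF NOT EXISTS Words(Name TEXT, Definition TEXT)"
        )
        # Parameter placeholders (?) avoid building SQL from raw strings.
        connection.executemany(
            "INSERT INTO Words VALUES (?, ?)",
            [(word, text) for text in definitions],
        )

# In-memory database so the sketch leaves no file behind.
connection = sqlite3.connect(":memory:")
save_definitions(connection, "set", ["placeholder 1", "placeholder 2", "placeholder 3"])
save_definitions(connection, "panacea", ["placeholder 4"])

# Count how many rows (definitions) each word ended up with.
counts = dict(
    connection.execute("SELECT Name, COUNT(*) FROM Words GROUP BY Name")
)
```

Here every row has exactly two columns, and a word with more definitions simply occupies more rows.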