Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
Random
Starting Member
3 Posts |
Posted - 2011-09-27 : 21:39:49
|
| I have been using SQL for a while, but it has been mostly basic database management (address books, shopping carts etc.) I am starting a new project and I'm having trouble thinking of how to design the database, ie what tables and columns to have and the best way to connect them.Basically, I am making a parser to parse strings of words. I want a database to store the following:How often each word is followed by a specific other word?For example, The string "The Fox Is The Fox with the most." I want to be able to select a word in the database (say "The") and find out how many times various other words followed it. So I should be able to query the database "How many times was "The" followed by "Fox" and it would return twice. Then "How many times was the word "Is" followed by "The" and it would return once. Lastly, "How many times was "The" followed by "Horse" would return none.The problem I'm having is how to setup the tables so the tables can grow as the parser reads new words? My first attempt was the following table:ID | Word | Following Word | Times Followed----------------------------------------1 | The | Fox | 22 | The | Most | 13 | Fox | Is | 14 | Fox | With | 1 . .And use something like: SELECT times_followed FROM word_table WHERE word = 'The' AND following_word = 'Fox'...But this seems like it may waste an awful lot of space. The word 'The' is going to be followed by an awful lot words. Not to mention, this keeps the entire collection of data in one table. It seems like breaking up the tables may be better, but I can't think of how. Maybe a table with the word "The" and a list of IDs pertaining to following words that point to another table of how many times each was preceded by "The"???? Are you confused yet?Can someone please help? |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2011-09-27 : 23:03:58
|
| Maybe a table with the word "The" and a list of IDs pertaining to following words that point to another table of how many times each was preceded by "The"i think your above approach might be most flexible one as you dont have to worry about how many words get associated to particular word or how many new words came into being. so i hope this should suffice------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
Random
Starting Member
3 Posts |
Posted - 2011-09-27 : 23:22:57
|
| I guess I knew it but didn't believe it. I wrote it out again and it worked this time. Thanks for giving me the confidence!!! |
 |
|
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2011-09-27 : 23:25:02
|
no problem. you're welcome ------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
Lumbago
Norsk Yak Master
3271 Posts |
Posted - 2011-09-28 : 01:57:00
|
| Can I ask what this is really for? And how/when do you update all the counts? If you have a decent amount of strings, are you counting these on the fly or do you store the counts?- LumbagoMy blog-> http://thefirstsql.com/2011/07/08/how-to-find-gaps-in-identity-columns-at-the-speed-of-light/ |
 |
|
|
|
|
|