True, but the question is which option performs better. In that case, there is no doubt about it: option #1 will perform better because the query does not have to JOIN with any other tables. Randolph does have a good point, though: whenever possible you should normalize your database structure.
Thank you for your quick response. I will read articles about normalized database structure, as I do not understand the concept very well yet. – Aktee Oct 15 '09 at 7:08
+1. Option 1 will definitely be faster. (But that doesn't mean it's good.) It would sure help you, Aktee, to know the pros/cons of normalization (I've learned this the painful way :D) – mives Oct 15 '09 at 7:10
Thanks guys, I will use the normalized database. From what I read, I don't think the performance gain is worth the frustration. – Aktee Oct 15 '09 at 7:22
If you are not experienced with database design, I'd suggest always going with the normalized version. It's the right thing to do in most cases. You might want to denormalize your database in some cases, but then you should know exactly why you are doing that.
Note that in the second case it's not multiple queries. It's just one query, where all the tables are joined together. For example: SELECT * FROM restaurant JOIN city ON city.Id=restaurant.City JOIN province ON province.Id=city.Province ... Yes, it takes longer to write, but it's better than having inconsistent data in the database (maintaining a denormalized database is way harder).
You can also use an ORM to do this kind of stuff for you.
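To make the single-query point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names and the Id, City and Province columns follow the query above; the Name columns and the sample rows are assumptions made up for illustration, and the real schema may well differ.

```python
import sqlite3

# In-memory database; schema is a guess based on the query in the answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE province (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE city (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL,
                       Province INTEGER NOT NULL REFERENCES province(Id));
    CREATE TABLE restaurant (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL,
                             City INTEGER NOT NULL REFERENCES city(Id));
    -- Hypothetical sample data.
    INSERT INTO province VALUES (1, 'Ontario');
    INSERT INTO city VALUES (1, 'Toronto', 1), (2, 'Ottawa', 1);
    INSERT INTO restaurant VALUES (1, 'Sample Diner', 1), (2, 'Example Cafe', 2);
""")

# One query, three tables: each restaurant with its city and province.
rows = conn.execute("""
    SELECT restaurant.Name, city.Name, province.Name
    FROM restaurant
    JOIN city ON city.Id = restaurant.City
    JOIN province ON province.Id = city.Province
    ORDER BY restaurant.Id
""").fetchall()

for restaurant_name, city_name, province_name in rows:
    print(restaurant_name, city_name, province_name)
```

The database is still asked only once; the joins just follow the foreign keys from restaurant to city to province.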
Thank you. I think I will go with scenario #2 then (If I understand the concept of normalized database design). – Aktee Oct 15 '09 at 7:12.
The second option is a normalised structure, which means your data is less redundant, less chance for making errors, etc. I always vote for normalising data unless you're going to run into performance problems. Incidentally, SELECT * FROM Table isn't good practice anyway. You'll want to put in the column names.
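One reason SELECT * is discouraged can be shown with a small sketch (Python's sqlite3; the restaurant table's Name column and the later Phone column are hypothetical). When someone adds a column, the shape of a SELECT * result changes, while a query that names its columns keeps returning the same shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE restaurant (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("INSERT INTO restaurant VALUES (1, 'Sample Diner')")

star_before = conn.execute("SELECT * FROM restaurant").fetchone()

# Later, the schema grows (hypothetical new column).
conn.execute("ALTER TABLE restaurant ADD COLUMN Phone TEXT")

star_after = conn.execute("SELECT * FROM restaurant").fetchone()
named_after = conn.execute("SELECT Id, Name FROM restaurant").fetchone()

print(len(star_before), len(star_after), len(named_after))
```

Code that unpacks a SELECT * row into a fixed number of variables would break after the schema change; the query with explicit column names would not.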
As for Select *, thank you, while I was waiting for responses (quite fast, actually!), I just read that select * is very bad. – Aktee Oct 15 '09 at 7:10.
Thank you guys for your input. "Normalized Database Design" was the key here. I googled it, speed-read it, and although it performs a little worse, the pros are really worth it.
Thanks again. (That was really fast btw!) en.wikipedia.org/wiki/Database%5Fnormali... Wikipedia states that a denormalized database performs better, but I think I am just getting cocky in thinking I can handle a big denormalized database.
I'll stick with the less risky scenario. If shit hits the fan, I'll change hardware =). Thanks again guys.
If you use the first scenario you get the problem of increased space use (for all the duplicate province, country, continent) and if you need to change the name of a city/country you need to change it in all rows where it's used. For convenience I would use the second scenario. I don't think there will be big performance differences between the two scenarios (in the first scenario you only touch one table, but read back more data from disk, in the second scenario you read less data from the disk, but from multiple tables).
It really depends on what kind of data you have there. Edit: To explain my point above: if you keep all data in one large table then you need to actually read all the rows from the disk, even if much of the data read is the same (namely the city, province, country, continent). Even if the SQL server caches whatever data it can, it won't help here since it can't know that the data in other rows is the same.
If you normalize the database and read from the restaurant table you will get IDs for the cities. Now if you have the same ID on multiple rows the SQL server will cache the data read for the city and won't hit the disk again, so there will be an increase in speed. This will be offset by the need to access another table, but with correct indexing on the city ID that shouldn't cost too much.
That's why I'm saying that with large databases the performance difference is not easy to assess and you'll be better off having a properly normalized DB. And yes, if you use a normalized DB (second scenario) you can change the city name in one place since there will be a single row for a city. The same will work for the others (province, country, continent).
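The "change it in one place" point can be sketched with Python's sqlite3 as follows. Both schemas and all sample values are assumptions for illustration only: restaurant_flat stands in for the first scenario and restaurant plus city for the second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Scenario 1 (denormalized): the city name is repeated on every row.
    CREATE TABLE restaurant_flat (Id INTEGER PRIMARY KEY, Name TEXT, CityName TEXT);
    INSERT INTO restaurant_flat VALUES
        (1, 'A', 'Bombay'), (2, 'B', 'Bombay'), (3, 'C', 'Bombay');

    -- Scenario 2 (normalized): restaurants only store the city's Id.
    CREATE TABLE city (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE restaurant (Id INTEGER PRIMARY KEY, Name TEXT,
                             City INTEGER REFERENCES city(Id));
    INSERT INTO city VALUES (1, 'Bombay');
    INSERT INTO restaurant VALUES (1, 'A', 1), (2, 'B', 1), (3, 'C', 1);
""")

# Renaming the city: count how many rows each scenario has to touch.
flat_changed = conn.execute(
    "UPDATE restaurant_flat SET CityName = 'Mumbai' WHERE CityName = 'Bombay'"
).rowcount
norm_changed = conn.execute(
    "UPDATE city SET Name = 'Mumbai' WHERE Id = 1"
).rowcount

# Every restaurant sees the new name through the join.
names = [r[0] for r in conn.execute(
    "SELECT city.Name FROM restaurant JOIN city ON city.Id = restaurant.City"
)]
print(flat_changed, norm_changed, names)
```

In the flat table every matching row is rewritten, and any row missed (for example because of a spelling variant) is left inconsistent; in the normalized layout a single row changes.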
Can you explain or give me a link about "Read back more data from disk" and "read less from disk but from multiple tables"? I find it quite confusing. The kind of data is text.
Although I have to admit it is a big database. – Aktee Oct 15 '09 at 7:07.
I can't really give you an answer, but what I can give you is a way to a solution: you have to find the angle that you relate to or that piques your interest. A good paper is one that people get drawn into because it reaches them in some way. As for me, when I think of WWII, I think of the Holocaust and the effect it had on the survivors, their families, and those who stood by and did nothing until it was too late.