sql - What's the best way to have a calculated column that involves multiple tables in a transactional database? -
what's best way have calculated column involves multiple tables in transactional database?
background
our transactional mysql database includes fact tables following simplified model:
create table lu_project( project_id int, project_desc varchar(60) ); create table f_project_budget( project_id int, budget numeric(15, 2) ); create table f_task_budget( task_id int, budget numeric(15,2) ); create table f_money_used( reported_time datetime, task_id int, money_used numeric(15, 2) );
task
child of project
. task ids not unique, project/task pairs unique.
requirements
we need preserve invariant total budget tasks project <= budget project.
in addition, need run query returns result set following columns:
project_id
, projet_desc
, project_budget
, task_count
, project_money_used
issues
we're concerned performance. simplest solution require updates hit 3 fact tables:
- check adding
f_money_used
wouldn't put on budget task inf_task_budget
. - check adding money wouldn't put on total budget project
- write
f_money_used
.
and query need stats hit 3 fact tables (in addition lookup tables, not in model above):
- join
lu_project
project_budget
f_project_budget
groupproject_id
- join
lu_project
f_task_budget
task_count
groupproject_id
- join
lu_project
f_money_used
project_money_used
groupproject_id
- join
lu_project
intermediate results above ,project_desc
well.
the problem that's lot of joining , both reads , writes happen frequently.
potential solution
one solution considering add project_money_used
field f_project_budget
updated on writes f_task_budget
. slow down writes speed reads.
the solution introduce complexity , data integrity concerns, since fact tables no longer "at grain." violates data warehousing principles, can't find out whether it's kosher transactional databases.
the slowdown in writes may not of concern if can optimistic rendering in ui, introduces more complexity.
other solutions considered
- for writes, considering using triggers preserve invariant.
- for reads, calculated columns looked promising, aren't allowed hit multiple tables in mysql.
- for reads, materialized views not option, data needs fresh in real time.
summary
is there better solution doing our reads , writes in safe, simple, , performant manner?
Comments
Post a Comment