|
Sphinx
Community
Services
Misc
Subscribe in a reader
|
Relational SQL Data Revisited
Common forum |
1 | 2 | 3 | 4 | 5 | ... |
263 | 264 | 265 | 266 | next »» | Create new thread
|
shanky
Name: Steven Smith Posts: 4 |
2008-01-30 22:57:51
| reply!
[I hope this is not too confusing.]
I think Sphinx will fit-the-bill on my search needs based on my readings and perusing of
the forum posts but I want to make sure before going through with a full install.
I will have a table of EntityA entities that will contain its ID and some attributes
particular to such entities (A). I will then have another table of EntityB entities that
will contain its ID and attributes particular to such entities (B).
The part that is hazy in how Sphinx will index and return results is that each EntityA
(through a link/xref table) will have a variable number of EntityBs associated with it.
Of particular interest is specifying multiple keyword searches where you want to find
EntityAs where its EntityBs match on those multiple keywords while still possibly
matching on some of the particulars of EntityA itself.
Thusly if I want to search for EntityAs having EntityB that can match on keywords
"keyword1 keyword2 keyword3" (with the quotes used in the search to associate them as an
actual phrase like "big blue rectangle") but also enter the term "impressionist" from
EntityA's particulars, are there easier search terms to enter for the end user rather
than having them enter the search as:
@EntityB:"big blue rectangle" @EntityA:impressionist
Although, that is probably incorrect as I would probably have to have them mention the
"field(s)" to search on (i.e. @field_name1:...) which I can see having to use some form
of concatenation if EntityB's attributes as they are linked/xref'ed to a generic
attribute table (i.e. EntityBs may only have a subset of a few attributes from a generic
attribute table of many, many entries).
This would be the much preferred method for the user to enter search terms (without alot
of programmatic manipulation of the search terms prior to searching):
"big blue rectangle" impressionist
Maybe I am overthinking this but I just want to be certain I can search multiple terms of
attributes for EntityB while pulling the parent entity as the search result.
What would be the ideal configuration for such (with weightings if needed for EntityAs)?
Schema Mock-up
==============
EntityAs
--------
A_ID
Name
Genre
EntityBs
--------
B_ID
Name
Section
EntityA_to_EntityB (link/xref: technically many(A)-to-many(B) but mostly 1(A) to many(B))
------------------
A_ID
B_ID
GenericAttributes
-----------------
GA_ID
Name
EntityB_to_GenericAttribute (link/xref: many-to-many)
---------------------------
B_ID
GA_ID
Thanks in advance.
|
 |
|
Kiamo
Name: Justin Posts: 56 |
to: shanky, 2008-01-30 23:03:49
| reply!
> [I hope this is not too confusing.]
>
> I think Sphinx will fit-the-bill on my search needs based on my readings and perusing of
> the forum posts but I want to make sure before going through with a full install.
>
> I will have a table of EntityA entities that will contain its ID and some attributes
> particular to such entities (A). I will then have another table of EntityB entities that
> will contain its ID and attributes particular to such entities (B).
>
> The part that is hazy in how Sphinx will index and return results is that each EntityA
> (through a link/xref table) will have a variable number of EntityBs associated with it.
> Of particular interest is specifying multiple keyword searches where you want to find
> EntityAs where its EntityBs match on those multiple keywords while still possibly
> matching on some of the particulars of EntityA itself.
>
> Thusly if I want to search for EntityAs having EntityB that can match on keywords
> "keyword1 keyword2 keyword3" (with the quotes used in the search to associate them as an
> actual phrase like "big blue rectangle") but also enter the term "impressionist" from
> EntityA's particulars, are there easier search terms to enter for the end user rather
> than having them enter the search as:
>
> @EntityB:"big blue rectangle" @EntityA:impressionist
>
> Although, that is probably incorrect as I would probably have to have them mention the
> "field(s)" to search on (i.e. @field_name1:...) which I can see having to use some form
> of concatenation if EntityB's attributes as they are linked/xref'ed to a generic
> attribute table (i.e. EntityBs may only have a subset of a few attributes from a generic
> attribute table of many, many entries).
>
> This would be the much preferred method for the user to enter search terms (without alot
> of programmatic manipulation of the search terms prior to searching):
>
> "big blue rectangle" impressionist
>
> Maybe I am overthinking this but I just want to be certain I can search multiple terms of
> attributes for EntityB while pulling the parent entity as the search result.
>
> What would be the ideal configuration for such (with weightings if needed for EntityAs)?
>
>
> Schema Mock-up
> ==============
>
>
> EntityAs
> --------
> A_ID
> Name
> Genre
>
>
> EntityBs
> --------
> B_ID
> Name
> Section
>
>
> EntityA_to_EntityB (link/xref: technically many(A)-to-many(B) but mostly 1(A) to many(B))
> ------------------
> A_ID
> B_ID
>
>
> GenericAttributes
> -----------------
> GA_ID
> Name
>
>
> EntityB_to_GenericAttribute (link/xref: many-to-many)
> ---------------------------
> B_ID
> GA_ID
>
> Thanks in advance.
>
Will an entity B always only have one entity A associated with it, but an entity A can
have multiple Bs?
If so, you could accomplish this what seems fairly easily with an SQL JOIN statement in
Sphinx's configuration for retrieving the data to index, unless I misunderstood something.
Another option is two indexes, one which has all of the information for the As and
another with the information for the Bs. Doing a query on all indexes (*; the default)
will return information from both, but this may be somewhat confusing due to IDs and not
knowing for sure which is which without additional work.
|
 |
|
shanky
Name: Steven Smith Posts: 4 |
to: Kiamo, 2008-01-30 23:26:23
| reply!
> Will an entity B always only have one entity A associated with it, but an entity A can
> have multiple Bs?
> If so, you could accomplish this what seems fairly easily with an SQL JOIN statement in
> Sphinx's configuration for retrieving the data to index, unless I misunderstood something.
>
> Another option is two indexes, one which has all of the information for the As and
> another with the information for the Bs. Doing a query on all indexes (*; the default)
> will return information from both, but this may be somewhat confusing due to IDs and not
> knowing for sure which is which without additional work.
I haven't formalized the EntityA to EntityB relationship yet. An EntityA can have
multiple EntityBs for sure. Depending on how unique EntityBs turn out, then there could
be re-use of an EntityB with other EntityAs (i.e. the attribute "fingerprint"
possibilities of EntityBs may warrant a many-to-many relationship to conserve space but I
am thinking the possible attribute count will be quite high so probably easier to have
seperate Bs for each A).
Thanks for the prompt reply by-the-way.
|
 |
|
eweaver
Name: Evan Posts: 77 |
to: shanky, 2008-01-31 00:31:02
| reply!
> I haven't formalized the EntityA to EntityB relationship yet. An EntityA can have
> multiple EntityBs for sure. Depending on how unique EntityBs turn out, then there could
> be re-use of an EntityB with other EntityAs (i.e. the attribute "fingerprint"
> possibilities of EntityBs may warrant a many-to-many relationship to conserve space but I
> am thinking the possible attribute count will be quite high so probably easier to have
> seperate Bs for each A).
>
> Thanks for the prompt reply by-the-way.
>
Every Sphinx index record is flat. Whatever you return from SQL is what gets indexed. You
can use GROUP_CONCAT (or some other grouping query) to load from M-M joins. If you are
trying to conserve space in the Sphinx index, it can't really be done, but that is
probably the least of your worries.
What's your actual use case?
|
 |
|
shanky
Name: Steven Smith Posts: 4 |
to: eweaver, 2008-01-31 02:38:55
| reply!
> Every Sphinx index record is flat. Whatever you return from SQL is what gets indexed. You
> can use GROUP_CONCAT (or some other grouping query) to load from M-M joins. If you are
> trying to conserve space in the Sphinx index, it can't really be done, but that is
> probably the least of your worries.
>
> What's your actual use case?
Yeah, I just got through reading about GROUP_CONCAT and then noticed it in that other
post on relational data in this forum. I think that will allow the EntityB terms
(concatenated) to be searchable as part of the EntityA they belong to. Sound about right?
And you use of "flat" does make it a bit simpler to think about.
Thanks all.
|
 |
|
shodan
Name: Andrew Aksyonoff Posts: 4117 |
to: shanky, 2008-02-03 21:30:46
| reply!
> I think that will allow the EntityB terms (concatenated) to be searchable as part of the
> EntityA they belong to.
Right. GROUP_CONCANT is as if you index EntityA table but append another column with the
concatenation of related rows from EntityB.
Alternatively you can also index EntityB on itself with parent EntityA ID stored as an
attribute, and use group-by.
|
Common forum |
1 | 2 | 3 | 4 | 5 | ... |
263 | 264 | 265 | 266 | next »» | Create new thread
|