This commit adds a rewrite rule for ebook downloads of the form:
```
/ebooks/some-author/some-book/downloads/some-filename.epub
```
to `www/ebooks/download.php`, which decides whether to show a thank-you
page before beginning the download.
Download URLs in RSS, Atom, and OPDS feeds follow the same pattern, but
they carry the query string parameter `?source=feed` so that they always
skip the thank-you page.
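The feed exception can be sketched roughly as follows. `ShouldSkipThankYouPage()` is a hypothetical helper, not the code in `www/ebooks/download.php`; it only illustrates the `?source=feed` check and leaves out whatever other conditions the real file applies.

```php
<?php
// Hypothetical helper for illustration only; the real logic in
// www/ebooks/download.php may be structured differently.
function ShouldSkipThankYouPage(string $url): bool{
	// Pull out the query string, if there is one.
	$query = parse_url($url, PHP_URL_QUERY);
	if(!is_string($query)){
		return false;
	}

	// Feed links carry `source=feed` and go straight to the file.
	parse_str($query, $parameters);
	return ($parameters['source'] ?? '') === 'feed';
}

var_dump(ShouldSkipThankYouPage('/ebooks/some-author/some-book/downloads/some-filename.epub?source=feed'));
var_dump(ShouldSkipThankYouPage('/ebooks/some-author/some-book/downloads/some-filename.epub'));
```

A `false` result here only means the feed shortcut does not apply; the sketch says nothing about when the thank-you page is shown otherwise.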
See #484 for details. By adding a special case for hyphens, users can
search for these terms:
* `beta` to match `Alpha-Beta`
* `queen` to match `Haycraft-Queen`
These searches also work as expected:
* `Alpha-Beta`
* `Alpha`
* `Haycraft-Queen`
* `Haycraft`
I don't think these queries should work, and they do not:
* `AlphaBeta`
* `HaycraftQueen`
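One way to get the matching behavior listed above is to turn hyphens into spaces before indexing, so each half of a compound becomes its own word. The sketch below assumes that approach; `NormalizeForIndex()` and `QueryMatches()` are hypothetical names, and the whole-word comparison is only a crude stand-in for the database's full-text matching.

```php
<?php
// Assumption: hyphens become spaces, other punctuation is dropped.
function NormalizeForIndex(string $text): string{
	$text = str_replace('-', ' ', $text);
	$text = preg_replace('/[^\p{L}\p{N}\s]/u', '', $text) ?? '';
	return strtolower(trim($text));
}

// Crude stand-in for a full-text match: every word in the query
// must appear as a whole word in the indexed value.
function QueryMatches(string $query, string $indexedValue): bool{
	$indexedWords = preg_split('/\s+/', NormalizeForIndex($indexedValue), -1, PREG_SPLIT_NO_EMPTY);
	$queryWords = preg_split('/\s+/', NormalizeForIndex($query), -1, PREG_SPLIT_NO_EMPTY);
	if($indexedWords === false || $queryWords === false){
		return false;
	}

	return $queryWords !== [] && array_diff($queryWords, $indexedWords) === [];
}

foreach(['beta', 'Alpha-Beta', 'Alpha', 'AlphaBeta'] as $query){
	print($query . ': ' . (QueryMatches($query, 'Alpha-Beta') ? 'match' : 'no match') . "\n");
}
```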
This commit changes `IndexableText`, `IndexableAuthors`, and
`IndexableCollections`, so existing DBs need an update. This will update
all published books:
```
cd /standardebooks.org/ebooks
# -mindepth 1 keeps the ebooks directory itself out of the loop.
for BOOK in $(find /standardebooks.org/ebooks -mindepth 1 -maxdepth 1 -type d)
do
	tsp nice /standardebooks.org/web/scripts/deploy-ebook-to-www --verbose --no-build --no-images --no-recompose --no-epubcheck --no-feeds --no-bulk-downloads "$BOOK"
done
```
And this PHP code will update placeholders:
```
<?
require_once('/standardebooks.org/web/lib/Core.php');

$ebooks = Ebook::GetAll();
foreach($ebooks as $ebook){
	if($ebook->IsPlaceholder()){
		print('Saving ' . $ebook->Identifier . "\n");

		// Need to force `Ebook::GetAllContributors()` to be called before `Ebook::Save()`. Otherwise, authors and translators will be deleted.
		$ebook->Authors;
		$ebook->Save();
	}
}
```
That is, don't replace non-alphanumerics with a space. This matches the
behavior of `Formatter::RemoveDiacriticsAndNonalphanumerics()`, which
would be used here except that it would also remove quotes. We
discussed not introducing spaces previously, but I made a mistake and
didn't apply the same change to the user's search query:
https://github.com/standardebooks/web/pull/470#discussion_r1929591492
This change fixes queries with compound words like:
`Haycraft-Queen`
and also fixes queries for authors with apostrophes:
`O'Neill`
No changes to existing DBs are necessary because they already have terms
like `haycraftqueen` and `oneill` stored in `IndexableCollections` and
`IndexableAuthors`.
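The difference between the two treatments of the query can be sketched as follows. `StripNonAlphanumerics()` is a hypothetical function written for this illustration; it is not `Formatter::RemoveDiacriticsAndNonalphanumerics()` and it ignores the quote handling mentioned above.

```php
<?php
// Hypothetical helper: replace every non-alphanumeric, non-space
// character with $replacement, then collapse whitespace and lowercase.
function StripNonAlphanumerics(string $text, string $replacement): string{
	$text = preg_replace('/[^\p{L}\p{N}\s]/u', $replacement, $text) ?? '';
	$text = preg_replace('/\s+/', ' ', $text) ?? '';
	return strtolower(trim($text));
}

// Replacing with a space splits the term, so it no longer lines up
// with a stored value such as `oneill`.
print(StripNonAlphanumerics("O'Neill", ' ') . "\n");

// Removing the character keeps the term whole.
print(StripNonAlphanumerics("O'Neill", '') . "\n");
print(StripNonAlphanumerics('Haycraft-Queen', '') . "\n");
```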
* Replace `EbookUrl` with `EbookId` in `Artworks`
* Add a `FullUrl` member to `Ebook`
  Add documentation about when to use it versus `Url`.
  The full URL is also being used as an ID in RSS feeds, so use `FullUrl` there.
* Store an `EbookId` in `Artworks`
The `Validate()` method correctly sets `IndexableText` to null, but the
`UPDATE` SQL statement then triggers another call to
`GetIndexableText()`. Without this change, empty strings are
written to the `IndexableText` column.
This allows the user to run a keyword search and then change the sort
order. `Default` is interpreted as `Relevance` if a query is present,
`Newest` if not.
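A minimal sketch of that rule, assuming the sort order arrives as a plain string. `ResolveSort()` and the lowercase string values are hypothetical and may not match how the codebase represents sort orders.

```php
<?php
// Hypothetical helper: turn the `default` sort into a concrete one.
function ResolveSort(string $sort, string $query): string{
	// An explicit choice by the user always wins.
	if($sort !== 'default'){
		return $sort;
	}

	// With a keyword query, rank by relevance; otherwise show newest first.
	return trim($query) !== '' ? 'relevance' : 'newest';
}

print(ResolveSort('default', 'haycraft') . "\n");
print(ResolveSort('default', '') . "\n");
print(ResolveSort('newest', 'haycraft') . "\n");
```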
Having `RelevanceScore` in the `SELECT` fields was causing warnings like this:
```
NOTICE: PHP message: PHP Deprecated: Creation of dynamic property Ebook::$RelevanceScore is deprecated in /standardebooks.org/web/lib/Traits/Accessor.php
```
The data in these fields are separate:
* `IndexableText`
* `Title`
* `IndexableAuthors`
* `IndexableCollections`
There are also indices on each of these fields so that they can be
weighted separately in the relevance scoring.
Here's what's in `IndexableText` right now:
1. Title
2. Collections
3. Authors
4. Tags
5. LocSubjects
6. TocEntries
Here is the proposed new ranking:
```
10 * Title +
8 * Authors +
3 * Collections +
IndexableText
```
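The arithmetic of the proposed ranking can be sketched as below. `RelevanceScore()` is a hypothetical function, and the per-field scores passed to it are made-up numbers standing in for whatever the full-text search returns for each field.

```php
<?php
// Hypothetical helper: combine per-field match scores using the
// weights from the proposed ranking.
function RelevanceScore(float $title, float $authors, float $collections, float $indexableText): float{
	return 10 * $title + 8 * $authors + 3 * $collections + $indexableText;
}

// Illustrative inputs only. A term found in the title also appears in
// `IndexableText`, so it contributes to both parts of the sum.
print(RelevanceScore(0.5, 0.0, 0.0, 0.5) . "\n");

// A term found only in `IndexableText`, for example in a tag.
print(RelevanceScore(0.0, 0.0, 0.0, 0.5) . "\n");
```

With equal raw scores, the title match ranks well above the tag-only match, which is the point of weighting the fields separately.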
New columns and indices for existing DBs:
```
ALTER TABLE `Ebooks` ADD COLUMN `IndexableAuthors` text NOT NULL;
ALTER TABLE `Ebooks` ADD COLUMN `IndexableCollections` text NULL;
ALTER TABLE `Ebooks` ADD FULLTEXT `indexSearchTitle` (`Title`);
ALTER TABLE `Ebooks` ADD FULLTEXT `idxSearchAuthors` (`IndexableAuthors`);
ALTER TABLE `Ebooks` ADD FULLTEXT `idxSearchCollections` (`IndexableCollections`);
```
Instead of `USING(EbookId)`, it would be easier to handle
`MultiTableSelect` queries in `FromMultiTableRow()` if the queries used
`ON Projects.EbookId = Ebooks.EbookId`
This is because `USING` will return only one `EbookId` column, but `ON`
will return the `EbookId` column from both tables.
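The column difference can be seen with a small experiment. This sketch assumes the `pdo_sqlite` extension and uses an in-memory SQLite database with simplified, hypothetical versions of the two tables rather than the production database; it only counts the columns that `SELECT *` returns for each join style.

```php
<?php
// Assumes pdo_sqlite is available; the tables are simplified stand-ins.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE Ebooks (EbookId INTEGER PRIMARY KEY, Title TEXT)');
$db->exec('CREATE TABLE Projects (ProjectId INTEGER PRIMARY KEY, EbookId INTEGER)');
$db->exec("INSERT INTO Ebooks VALUES (1, 'Example')");
$db->exec('INSERT INTO Projects VALUES (10, 1)');

// Count the columns in the first row returned by a query.
function CountColumns(PDO $db, string $sql): int{
	$row = $db->query($sql)->fetch(PDO::FETCH_NUM);
	return is_array($row) ? count($row) : 0;
}

// `USING` merges the join column, so `EbookId` appears once.
$usingCount = CountColumns($db, 'SELECT * FROM Projects INNER JOIN Ebooks USING (EbookId)');

// `ON` keeps `Projects.EbookId` and `Ebooks.EbookId` as separate columns.
$onCount = CountColumns($db, 'SELECT * FROM Projects INNER JOIN Ebooks ON Projects.EbookId = Ebooks.EbookId');

print('USING: ' . $usingCount . " columns\n");
print('ON: ' . $onCount . " columns\n");
```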
On `Ebook::Save()` and `Ebook::Delete()`, remove any unreferenced `Tag`,
`LocSubject`, and `Collection` records. These removals are analogous to these lines in
`Ebook::Create()`:
```
$this->CreateTags();
$this->CreateLocSubjects();
$this->CreateCollections();
```
`EbookPlaceholder`s can't have `Tags` or `LocSubjects` at the moment, but other
mistakes in production that are later corrected could leave unused `Tags` and
`LocSubjects`.
Context: https://github.com/standardebooks/web/pull/447#issuecomment-2555734692
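Removing unreferenced records could look something like the sketch below. It assumes the `pdo_sqlite` extension and an in-memory database; the `Tags` and `EbookTags` tables are hypothetical simplifications, and the actual cleanup in `Ebook::Save()` and `Ebook::Delete()` may use different tables and queries.

```php
<?php
// Assumes pdo_sqlite is available; table names are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE Tags (TagId INTEGER PRIMARY KEY, Name TEXT)');
$db->exec('CREATE TABLE EbookTags (EbookId INTEGER, TagId INTEGER)');
$db->exec("INSERT INTO Tags VALUES (1, 'Fiction'), (2, 'Orphaned')");
$db->exec('INSERT INTO EbookTags VALUES (100, 1)');

// Delete every tag that no ebook references any more.
$db->exec('DELETE FROM Tags WHERE NOT EXISTS (SELECT 1 FROM EbookTags WHERE EbookTags.TagId = Tags.TagId)');

$remainingTags = $db->query('SELECT Name FROM Tags ORDER BY TagId')->fetchAll(PDO::FETCH_COLUMN);
print(implode(', ', $remainingTags) . "\n");
```

The same pattern would apply to `LocSubject` and `Collection` records, each with its own link table.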