The Early Modern Commons

Search Results for "spelling"

Your search for posts with tags containing spelling found 10 posts

The Monkey’s, a satirical print published in 1804. How...

The Monkey’s, a satirical print published in 1804. How comforting to learn that the use of apostrophe-s in place of the regular plural with a simple S has been going on for such a long time. I don’t know what that dog is doing, so I’ve...

Forcing Standardization in VARD, Part 2

The final aspect of standardization I will discuss will be common early modern spellings forced to modern equivalents, decisions where the payoff of consistency outweighs slight data loss. The VEP team decided to force bee > be, doe > do, and wee...
From: Visualizing English Print on 28 Aug 2015

Forcing Standardization in VARD, Part 1

Optimizing VARD for the early modern drama corpus required “forcing” lexical changes to create higher levels of standardization in the dataset. Jonathan Hope gave me editorial principles to follow as we considered what words/patterns VARD...
From: Visualizing English Print on 28 Aug 2015

VARD Normalization Errors

VARD decently standardizes Early Modern English. Sometimes, though, it makes questionable replacements. ORIGINAL NORMALIZATION SHOULD BE all’s ell’s all’s caus’d cause caused Cicilia Cicely Cicilia courtesie curtsy courtesy diuers...
From: Visualizing English Print on 25 Aug 2015

Tweaking VARD: Aggressive Rules for Early Modern English Morphemes and Elisions

Since I have discussed how VARD behaves with character encoding and symbols, I will devote space to explaining how I tweaked VARD to standardize Jonathan Hope’s early modern drama corpus. Given the size of Hope’s corpus, it required automating...
From: Visualizing English Print on 24 Aug 2015

Liest thou, or hast a Rewme? Getting the best from VARD and EEBO

This week, we’ve replaced the default VARD set-up with a version designed to optimise the tools for VARD. In essence, this includes a lengthier set of rules to guide the changing of letters, and lists of words and variants that
From: Linguistic DNA on 3 Aug 2015

Illustrating the tools: first insights on VARD & MorphAdorner

The Sheffield RAs are hard at work on our audit of Early English Books Online, figuring out how best to clean up the TCP data for Linguistic DNA’s research goals. In the last post, Seth documented our intention to try out...
From: Linguistic DNA on 24 Jul 2015

EEBO-TCP and standard spelling

The Linguistic DNA project relies on two very large linguistic data sources for evidence of semantic and conceptual change from c.1500 to c.1800: Early English Books Online Text Creation Partnership dataset (EEBO-TCP), and Gale Cengage’s Eighteenth...
From: Linguistic DNA on 10 Jul 2015

Choice Tags: A Search Function’s Best Friend

<choice> is the xml element we use to encode alternative spelling in our transcriptions of Blake’s writings. It’s what makes the Blake Archive’s search function forgiving. Say someone searches for all instances of the word “Tiger” in...

An error not to be exonerated

Yesterday evening saw me sitting in the gods, with the young people, at Oxford’s New Theatre for a performance of La bohème – an experience that made me rue how many years I have wasted not going to the opera. I think the last occasion I...

Notes on Post Tags Search

By default, this searches for any categories containing your search term: e.g., Tudor will also find Tudors, Tudor History, etc. Check the 'exact' box to restrict the search to categories that exactly match your term. All searches are case-insensitive.
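The matching rule described above can be illustrated with a short Python sketch. This is not EMC's actual code: the function name `tag_matches` and the sample tags are invented for illustration, and it assumes matching is a plain substring test on case-folded text.

```python
def tag_matches(tag: str, term: str, exact: bool = False) -> bool:
    """Return True if a post tag matches the search term, ignoring case.

    Hypothetical helper: a default search is a substring test,
    an 'exact' search requires the whole tag to equal the term.
    """
    tag_norm = tag.casefold()
    term_norm = term.casefold()
    if exact:
        return tag_norm == term_norm
    return term_norm in tag_norm


# Made-up tags, for illustration only.
tags = ["Tudors", "Tudor History", "Stuart London", "tudor"]

print([t for t in tags if tag_matches(t, "Tudor")])
# ['Tudors', 'Tudor History', 'tudor']
print([t for t in tags if tag_matches(t, "Tudor", exact=True)])
# ['tudor']
```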

This is a search for tags/categories assigned to blog posts by their authors. The terminology used for post tags varies across different blog platforms, but WordPress tags and categories, Blogspot labels, and Tumblr tags are all included.

This search feature has a number of purposes:

1. to give site users improved access to the content EMC has been aggregating since August 2012, so they can look for bloggers posting on topics they're interested in, explore what's happening in the early modern blogosphere, and so on.

2. to facilitate and encourage the proactive use of post categories/tags by groups of bloggers with shared interests. All searches can be bookmarked for reference, making it possible to build useful resources of blogging about specific news, topics, conferences, etc., much as Twitter hashtags are used. Bloggers could agree on a shared tag for posts, or an event organiser could announce one in advance.

Caveats and Work in Progress

This does not search post content, and it will not find any informal keywords/hashtags within the body of posts.

If EMC doesn't find any <category> tags for a post in the RSS feed, the post is classified as uncategorized. These posts, and any whose <category> in the feed is itself 'uncategorized', are omitted from search results. (It should always be borne in mind that some bloggers never use any kind of category or tag at all.)
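To make the rule concrete, here is a minimal Python sketch of how an aggregator might apply it to an RSS 2.0 feed. It is an illustration under stated assumptions, not a description of how EMC is implemented: the sample feed, the function name `searchable_tags`, and the choice to drop only the 'uncategorized' tag (keeping any other tags on the same post) are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample feed: one tagged post, one with no <category>,
# and one whose only category is 'Uncategorized'.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <item><title>Post A</title><category>Spelling</category><category>VARD</category></item>
  <item><title>Post B</title></item>
  <item><title>Post C</title><category>Uncategorized</category></item>
</channel></rss>"""


def searchable_tags(feed_xml: str) -> dict[str, list[str]]:
    """Map each post title to its tags, leaving out uncategorized posts."""
    index = {}
    for item in ET.fromstring(feed_xml).iter("item"):
        title = item.findtext("title", default="")
        tags = [c.text.strip() for c in item.findall("category")
                if c.text and c.text.strip()]
        # Assumption: a literal 'uncategorized' tag is treated like no tag.
        tags = [t for t in tags if t.casefold() != "uncategorized"]
        if tags:  # posts left with no tags never enter the search index
            index[title] = tags
    return index


print(searchable_tags(SAMPLE_FEED))
# {'Post A': ['Spelling', 'VARD']}
```

The sketch handles RSS <category> elements only; feeds in other formats would need their own handling.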

This is not a 'real time' search, although EMC updates its content every few hours, so it is never very far behind events.

The search is at present quite basic and limited. I plan to add more sophisticated features in the future, including the ability to filter by blog tags and by dates. I may also introduce RSS feeds for search queries at some point.

Constructing Search Query URLs

If you'd like to use an event tag, you can work out in advance what the URL will be, without needing to visit EMC and run the search manually (though you would be well advised to check that it works!). You will need to apply URL encoding to any spaces or punctuation in the tag, so it may be simplest to avoid them.

This is the basic structure:

http://commons.earlymodernweb.org/searchcat?s={search term or phrase}

For example, the URL for a simple search for categories containing London:

http://commons.earlymodernweb.org/searchcat?s=london

The URL for a search for the exact category Gunpowder Plot:

http://commons.earlymodernweb.org/searchcat?s=Gunpowder%20Plot&exact=on

In this more complex URL, %20 is the URL encoding for a space between words and &exact=on adds the exact category requirement.
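If you would rather generate these URLs than type them, the following Python sketch builds them with the standard library. The function name `search_url` is invented for illustration; the base address and the `s` and `exact` parameters are the ones shown above.

```python
from urllib.parse import quote, urlencode

BASE = "http://commons.earlymodernweb.org/searchcat"


def search_url(term: str, exact: bool = False) -> str:
    """Build a tag-search URL (hypothetical helper, not part of EMC)."""
    params = {"s": term}
    if exact:
        params["exact"] = "on"
    # quote_via=quote encodes a space as %20; the default (quote_plus)
    # would produce '+' instead.
    return f"{BASE}?{urlencode(params, quote_via=quote)}"


print(search_url("london"))
# http://commons.earlymodernweb.org/searchcat?s=london
print(search_url("Gunpowder Plot", exact=True))
# http://commons.earlymodernweb.org/searchcat?s=Gunpowder%20Plot&exact=on
```

Punctuation in a tag is percent-encoded in the same way, so it is still worth opening the generated URL once to confirm that it returns the posts you expect.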

I'll do my best to ensure that the basic URL construction (searchcat?s=...) is stable and persistent as long as the site is around.