Frequently Asked Questions about The Bear Test

Spoilers abound below. Take the test before you read on.

Where did this test come from?

The answer to that question is lost in the mists of time. I first heard of the test when I was attending Los Alamitos High School back in the stone age.

I was riding around with Stan Abrams and Scott Rainey in Dave Terrell’s van when Stan started asking me a whole lot of stupid questions about my fantasy world. Then he explained how my answers were interpreted. It was a funny little exercise, and at the next party I attended I tested a few people. It was a hit, and people were begging (well, politely asking) me to test their friends all the time after that.

Since then, I’ve seen the test all over the ’net. Variations include checking whether the subject gets wet when crossing the river and having the bear represent a MOTAS (member of the appropriate sex).

Why do you interpret (the bear/the cup/the water) like that, when you’re supposed to interpret it like this?

There are hundreds if not thousands of variations of the Bear Test on the ’net; this is but one of them. Feel free to implement your own variation.

What’s the history of the Interactive Bear Test?

The Interactive Bear Test started out in September 1997 as a CGI program designed to teach me how to write Perl 5. (It was also written in response to some idiot, whose name I’ve forgotten, who said it was impossible to simulate interactive sessions on the web without cookies.) The results used to be mailed to the reviewer as plain text and manually formatted for the web. This system worked fine as long as the number of tests averaged one per week.

On Black Thursday (19 November 1998), I received sixteen tests. The following day I received another ten. The Bear Test had started hitting the mailing lists (webboards, the current bane of my existence, weren’t around back then). Something had to give. The interim solution (which, much to my chagrin, lasted two years) was to cap the number of tests per day at ten (the current cap hovers between thirty and fifty, depending on my mood).

I had started work on my Master’s around then, so the Bear Test was neglected. Late in 2001 I started taking an interest in XML. The Interactive Bear Test seemed to be a fine candidate for XML, as the vast majority of answers were boilerplate. It took about six months of work to translate my hand-coded HTML files into basic XML, but the end result was spectacular. The result files could now be edited by labeling the answers instead of writing complete analyses.

During this time I was completing my Master’s in Artificial Intelligence (a worthless endeavor, but that’s another story). One of the last projects I worked on was the modeling and prediction of user behavior based upon past interactions. One afternoon, I thought, “Wouldn’t it be clever if I modeled new analyses of Bear Tests based upon past analyses?” Two months later, the AI version of the Bear Test was released.

The results were surprising. On well-written submissions, the accuracy was around 70%, higher if you count acceptable misclassifications. Most of the poor performance comes from two sources: submitters not reading the instructions (the AI has a problem with negative assertions) and spelling errors.

How does it work?

How the multiple-choice answers work is self-evident. The free-form answers are a little more complex. Basically, the test determines scores in each category for every dictionary word (where “dictionary” is defined to be the Unix spelling list — thus misspellings and IM glyphs are ignored). The scores for all the words in the submission text are totaled and the text is assigned to the category with the best score. How word scores are determined is rather complex and covered in a paper I wrote about the test for my own amusement.
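The scoring step described above can be sketched roughly as follows. This is only an illustration of the "total the word scores, pick the best category" idea: the tiny `DICTIONARY`, the numbers in `WORD_SCORES`, and the `classify` function are all made up for the example, and the real scores are derived differently (see the paper).

```python
from collections import defaultdict

# Hypothetical stand-ins: the real test uses the Unix spelling list as its
# dictionary, and its word scores are learned from past analyses.
DICTIONARY = {"cozy", "warm", "cold", "dark", "empty", "room"}
WORD_SCORES = {
    "cozy": {"Comfortable": 2.0, "Heavenly": 1.0},
    "warm": {"Comfortable": 1.5, "Heavenly": 0.5},
    "cold": {"Uncomfortable": 2.0, "Hellish": 0.5},
    "dark": {"Uncomfortable": 1.0, "Hellish": 1.0},
    "empty": {"Uncomfortable": 0.5},
}


def classify(text, dictionary=DICTIONARY, word_scores=WORD_SCORES):
    """Total the per-category scores of every dictionary word in the text
    and return the category with the highest total (None if nothing scored)."""
    totals = defaultdict(float)
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        if word not in dictionary:
            continue  # misspellings and IM glyphs are ignored
        for category, score in word_scores.get(word, {}).items():
            totals[category] += score
    if not totals:
        return None
    return max(totals, key=totals.get)
```

Here `classify("A cozy, warm room. lol kewl")` skips "a", "lol", and "kewl" because they are not in the toy dictionary, and the remaining words add up in favor of "Comfortable".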

So, what are the categories? What are their interpretations?

Ah, the most popular question of all! The categories are summarized in the following table:

| Section | Category | Possible Values | Interpretation |
|---|---|---|---|
| Room Description (25 states) | Comfort | Hellish | A highly traumatic childhood. |
| | | Uncomfortable | A childhood that was devoid of happiness. |
| | | Average | A bland, uninspiring childhood. |
| | | Comfortable | A childhood that was pleasant. |
| | | Heavenly | A childhood filled with joy. |
| | Furnishing | Bare | A complete absence of memories. |
| | | Spartan | Few memories of that time. |
| | | Average | Normal memories of childhood. |
| | | Decorated | Strong memories of childhood. |
| | | Detailed | Rich, lasting memories of childhood. |
| Tree Description (25 states) | Size | Stunted | Dearth of adult interaction with the subject. |
| | | Small | Adults had a weak influence on the subject. |
| | | Average | The normal influence adults have on a child. |
| | | Large | Adults had a strong influence on the subject. |
| | | Giant | Adults had a significant and substantial impact on the subject. |
| | Lighting | Dark | Oppressed by the attentions of the adults. |
| | | Dusky | Somewhat oppressed by the attention the adults gave. |
| | | Average | Received enough attention to be guided but not oppressed. |
| | | Lit | Considerable freedom at this time. |
| | | Bright | Had extensive freedom while growing up. |
| Path Description (32 states) | Visibility | Visible | Good ideas of what to expect from adolescence. |
| | | Poor | Confused by the changes brought on by adolescence. |
| | Width | Wide | Had numerous options for emotional growth. |
| | | Narrow | Had limited options for emotional growth. |
| | Use | Little | Strong feelings of isolation at that time. |
| | | Frequent | Received a lot of support from friends and family. |
| | Obstructions | None | Had no problems during adolescence. |
| | | Trees* | Problems arose mostly from interactions with adults. |
| | | Some | Occasional problem in adolescence. |
| | | Many | Many problems during adolescence. |
| Water Description (20 states) | Movement | Stagnant† | A sex drive that is absent or pathologically inactive. |
| | | Gentle | A passive, restrained, calm sex drive. |
| | | Average | A normal, average sex drive. |
| | | Fast | A strong, active sex drive. |
| | | Rapid | A powerful, vigorous, compulsive sex drive. |
| | Clarity | Clear | Has no issues regarding sex. |
| | | Murky† | Has significant issues regarding sex. |
| | Life | Present | A strong desire for children. |
| | | Absent | (Normal desire for children.) |
| Cup Description (4 states) | Utility | Practical | Pragmatic when it comes to questions of marriage. |
| | | Decorative | Views marriage as a romantic adventure. |
| | | Both | Considers both romantic and pragmatic aspects of marriage. |
| | | Worthless | Cynical about the institution. |
| Key Description (27 states) | Purpose | Ordinary | No extraordinary expectations about the career. |
| | | Versatile | Numerous but unfocused expectations about the career. |
| | | Magical | Unreasonably high expectations about the career. |
| | | Monetary | Fixated on gaining wealth through the career. |
| | | Political | Indicative of a desire for power. |
| | | Path‡ | Expected to solve a life-problem. |
| | | Personal‡ | Expected to solve other people’s problems. |
| | | Worthless | Cynical about finding any satisfaction through a job. |
| | | Unknown | Does not have a career goal. |
| | Appearance | Everyday | Desires a nondescript career. |
| | | Decorative | Wants an attention-grabbing, one-of-a-kind career. |
| | | Antique | Desires a traditional career. |
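
The "(N states)" figures appear to be the number of possible combinations of category values within each section. That reading is my assumption, but the arithmetic agrees with all six sections. A quick check, with the value counts taken from the table and the names (`VALUES_PER_CATEGORY`, `state_count`) invented for the example:

```python
from math import prod

# Number of possible values per category, counted from the table.
VALUES_PER_CATEGORY = {
    "Room": {"Comfort": 5, "Furnishing": 5},
    "Tree": {"Size": 5, "Lighting": 5},
    "Path": {"Visibility": 2, "Width": 2, "Use": 2, "Obstructions": 4},
    "Water": {"Movement": 5, "Clarity": 2, "Life": 2},
    "Cup": {"Utility": 4},
    "Key": {"Purpose": 9, "Appearance": 3},
}


def state_count(section):
    """Multiply the value counts of a section's categories together."""
    return prod(VALUES_PER_CATEGORY[section].values())


STATE_COUNTS = {section: state_count(section) for section in VALUES_PER_CATEGORY}
# Room 25, Tree 25, Path 32, Water 20, Cup 4, Key 27
```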

A couple of notes on the above:

I didn’t like the answers you gave me! What are you going to do about it?

I will either remove the submission or make it an anonymous entry by removing your name. Do not ask me to change it (unless it’s a spelling correction); this weakens the AI.

This really isn’t much of an artificial intelligence, is it? I mean, it’s easily confused.

A lot of people aren’t able to separate what I call “high AI” from “low AI.” High AI concerns itself with the mimicking of human behavior: emotional interpretation, creativity, adaptive interaction, logical deduction. Low AI is more pragmatic, with goals of reducing effort in solving complex problems by providing heuristics that guide the algorithm to a solution. Low AI is concerned with search and discovery.

Low AI is here and now: spam filters and traffic routing on the Internet use AI techniques to make the difficult mundane.

The algorithm used is a classic Naïve Bayes classifier. Since it is the result of serious AI research, it is a proper AI. It can’t serve up tea or pilot your spaceship to Jupiter, but computers are decades — even centuries — away from doing that.
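For the curious, a bare-bones Naïve Bayes text classifier looks something like the sketch below. It is not the Bear Test's actual code: the multinomial word model, the add-one smoothing, and every name in it (`NaiveBayes`, `train`, `classify`) are assumptions chosen to keep the example short.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Minimal multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of examples
        self.vocabulary = set()

    def train(self, words, category):
        """Add one hand-corrected example to the corpus."""
        self.doc_counts[category] += 1
        self.word_counts[category].update(words)
        self.vocabulary.update(words)

    def classify(self, words):
        """Return the category with the highest log-probability, or None
        if nothing has been trained yet."""
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total_docs)  # log prior
            for word in words:
                if word in self.vocabulary:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_category, best_score = category, score
        return best_category
```

Train it on a handful of labeled word lists and it will label a new one by whichever category makes those words most probable. More examples in the corpus mean better word counts, which is why hand-corrected submissions make the classifier stronger.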

I took the test multiple times until I got the answer I wanted. Do you mind?

(OK, this is not a FAQ — but I wish it were.) Taking the test multiple times with nearly identical answers weakens the AI. I reserve the right to delete subsequent submissions, since the first is usually the honest one.

Personally, I’ve never seen the need to cheat on a psychology test, especially an irrelevant (and irreverent) one such as this. Of course, a compulsive need to get the answers “right” tells me more about your personality than any test ever could.

My test is gone! What happened?

I have horrible luck with ISPs and hard drives. The occasional test gets lost during disk crashes and file transfers. Please accept my heartfelt apologies.

Why does the test get locked? What does it mean to be “provisional”?

The Bear Test is a work in progress. The AI has only a small corpus (set of examples) right now, so its accuracy is limited. By hand-correcting the analyses and adding them to the corpus, I make the AI more powerful.

The system locks the test if I fall behind in correcting tests. “Provisional” is the code word for uncorrected.