# From https://snowballstem.org/algorithms/english/stop.txt
# This file is distributed under the BSD License.
# See https://snowballstem.org/license.html
# Also see https://opensource.org/licenses/bsd-license.html
#  - Encoding was converted to UTF-8.
#  - This notice was added.
#  - Comments were changed from `|` to `#` so that this list can be parsed by OpenNLP's stopword loader.
#

# An English stop word list. Comments begin with `#`. Each stop
# word is at the start of a line.

# Many of the forms below are quite rare (e.g. "yourselves") but are included
# for completeness.

# PRONOUN FORMS
# 1st person singular

i

me
my
# the possessive pronoun `mine' is best suppressed, because of the
# sense of coal-mine etc.
myself
# 1st person plural
we

# us (object form) is left out.
# Care is required here because US = United States. It is usually
# safe to remove it if it is in lower case.
our
ours
ourselves
# second person (archaic `thou' forms not included)
you
your
yours
yourself
yourselves
# third person singular
he
him
his
himself

she
her
hers
herself

it
its
itself
# third person plural
they
them
their
theirs
themselves
# other forms (demonstratives, interrogatives)
what
which
who
whom
this
that
these
those

# VERB FORMS (using F.R. Palmer's nomenclature)
# BE
am
is
are
was
were
be
been
being
# HAVE
have
has
had
having
# DO
do
does
did
doing

# The forms below are, I believe, best omitted, because of the significant
# homonym forms:

#  He made a WILL
#  old tin CAN
#  merry month of MAY
#  a smell of MUST
#  fight the good fight with all thy MIGHT

# would, could, should and ought might, however, be included

# AUXILIARIES
# WILL
#will

would

# SHALL
#shall

should

# CAN
#can

could

# MAY
#may
#might
# MUST
#must
# OUGHT

ought

# COMPOUND FORMS, increasingly encountered nowadays in 'formal' writing
# pronoun + verb

i'm
you're
he's
she's
it's
we're
they're
i've
you've
we've
they've
i'd
you'd
he'd
she'd
we'd
they'd
i'll
you'll
he'll
she'll
we'll
they'll

# verb + negation

isn't
aren't
wasn't
weren't
hasn't
haven't
hadn't
doesn't
don't
didn't

# auxiliary + negation

won't
wouldn't
shan't
shouldn't
can't
cannot
couldn't
mustn't

# miscellaneous forms

let's
that's
who's
what's
here's
there's
when's
where's
why's
how's

# rarer forms

# daren't needn't

# doubtful forms

# oughtn't mightn't

# ARTICLES
a
an
the

# THE REST (Overlap among prepositions, conjunctions, adverbs, etc. is so
# high that classification is pointless.)
and
but
if
or
because
as
until
while

of
at
by
for
with
about
against
between
into
through
during
before
after
above
below
to
from
up
down
in
out
on
off
over
under

again
further
then
once

here
there
when
where
why
how

all
any
both
each
few
more
most
other
some
such

no
nor
not
only
own
same
so
than
too
very

# Just for the record, the following words are among the commonest in English
# but are left commented out of this list:

# one
# every
# least
# less
# many
# now
# ever
# never
# say
# says
# said
# also
# get
# go
# goes
# just
# made
# make
# put
# see
# seen
# whether
# like
# well
# back
# even
# still
# way
# take
# since
# another
# however
# two
# three
# four
# five
# first
# second
# new
# old
# high
# long

