# snowballstem
This repository contains the Go stemmers generated by the [Snowball](https://github.com/snowballstem/snowball) project. They are maintained outside of the core bleve package so that they can be more easily reused in other contexts.
## Usage
Each stemmer package exports a single `Stem()` function which operates on a snowball `Env` structure. The `Env` structure maintains all state for the stemmer. A new `Env` is created with `NewEnv()` to point at an initial string. After stemming, the result of the `Stem()` operation can be retrieved using the `Current()` method. The `Env` structure can be reused for subsequent calls by using the `SetCurrent()` method.
## Example
```
package main

import (
	"fmt"

	"github.com/blevesearch/snowballstem"
	"github.com/blevesearch/snowballstem/english"
)

func main() {
	// words to stem
	words := []string{
		"running",
		"jumping",
	}

	// build new environment
	env := snowballstem.NewEnv("")

	for _, word := range words {
		// set up environment for word
		env.SetCurrent(word)
		// invoke stemmer
		english.Stem(env)
		// print results
		fmt.Printf("%s stemmed to %s\n", word, env.Current())
	}
}
```
This produces the following output:
```
$ ./snowtest
running stemmed to run
jumping stemmed to jump
```
## Testing
The test harness for these stemmers is hosted in the main [Snowball](https://github.com/snowballstem/snowball) repository. There are functional tests built around the separate [snowball-data](https://github.com/snowballstem/snowball-data) repository, and there is support for fuzz-testing the stemmers there as well.
## Generating the Stemmers
```
$ export SNOWBALL=/path/to/github.com/snowballstem/snowball/after/snowball/built
$ go generate
```
## Updating the Go Generate Commands
A simple tool is provided to regenerate the `go generate` commands from the snowball algorithms directory:
```
$ go run gengen.go /path/to/github.com/snowballstem/snowball/algorithms
```