Log
-
Add a basic .gitignore by Caio 8 years ago
-
More README tweaks 💬 by Caio 8 years ago
gopkg.in is now listed second, since the preferred installation method will soon be via `get+dep`.
-
Make travis use `dep` when testing by Caio 8 years ago
-
Control dependencies via dep 💬 by Caio 8 years ago
This patch freezes the go_rng dependency to a particular commit since the project doesn't tag its stable releases.
-
Add tests for a heavily skewed gamma distribution 💬 by Caio 8 years ago
The code is mostly borrowed from TDigestTest.java and could easily be modified to allow testing with multiple distributions/ranges.

Closes #13

Note that this patch adds a new test-only dependency but we don't use any form of dependency management - this will come in a subsequent patch.
-
Add new CDF(float64) public method by Caio 8 years ago
-
Add a simple CONTRIBUTING.md guide by Caio 8 years ago
-
Beef the README up a bit by Caio 8 years ago
-
Remove TDigest.Len() from the public interface 💬 by Caio 8 years ago
I can't think of a scenario where one would really care about how many distinct centroids are in the digest, so away this goes on a major release. Adding it back in case it's needed won't require a major release.
-
Tweak the docs a bit 💬 by Caio 8 years ago
Paragraphs are good!
-
Introduce TDigest.Count() 💬 by Caio 8 years ago
Expose the count of samples publicly so that users can more easily decide what to do when the digest has too many samples.
-
Change TDigest.count to uint64 (was uint32) 💬 by Caio 8 years ago
So now we can register more than 4B samples and still reply with valid quantile estimations, but notice that centroid counts are still uint32 and that `t.count` is used for floating point divisions across the algorithms so a really big count is not really desirable. Individual centroid counts wrapping around uint32 is a pathological scenario that can totally be simulated and treated for, but that would mean a lot of extra comparisons added in the critical path, so I prefer to leave that to the user's decision.
-
Update README with configuration notes by Caio 8 years ago
-
Get rid of the centroid abstraction 💬 by Caio 8 years ago
This was only being used to pack {float64,uint32}, all the other functionality was skipped or became unused over time for performance reasons. Away it goes.
-
Expose options to change the RNG being used 💬 by Caio 8 years ago
This patch creates a new public interface `TDigestRNG` and exposes two new options:

- tdigest.RandomNumberGenerator(TDigestRNG)
- tdigest.LocalRandomNumberGenerator(seed)
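A rough sketch of why a pluggable, seedable generator is useful. The `rng` interface and `newLocalRNG` below are hypothetical stand-ins; the method set of the real `TDigestRNG` is not shown in this log, so the two methods here are an assumption chosen because `*rand.Rand` satisfies them.

```go
package main

import (
	"fmt"
	"math/rand"
)

// rng is a hypothetical stand-in for an RNG abstraction. The method set is
// an assumption, not the library's actual interface definition.
type rng interface {
	Float32() float32
	Intn(n int) int
}

// newLocalRNG returns a generator that is private to its owner and seeded
// explicitly, so runs are reproducible. Hypothetical helper.
func newLocalRNG(seed int64) rng {
	return rand.New(rand.NewSource(seed))
}

func main() {
	a, b := newLocalRNG(42), newLocalRNG(42)

	// Two generators built from the same seed yield the same sequence.
	for i := 0; i < 3; i++ {
		fmt.Println(a.Intn(100) == b.Intn(100), a.Float32() == b.Float32())
	}
}
```

A fixed seed makes test runs repeatable, and a generator owned by a single digest avoids sharing the global `math/rand` source.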
-
Abstract math/rand usage into the TDigestRNG interface by Caio 8 years ago
-
Make Add take only one parameter, introduce AddWeighted 💬 by Caio 8 years ago
This patch renames the previous Add(float64,uint32) to AddWeighted and introduces a method Add(float64) which is simply an alias to AddWeighted(float64,1).
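The aliasing described above can be sketched as follows. The `digest` type is a toy accumulator invented for this example (it only tracks a weighted sum and a sample count); it is not the library's `TDigest` and performs no quantile estimation.

```go
package main

import "fmt"

// digest is a toy stand-in that only accumulates a weighted sum and a
// sample count. Hypothetical type for illustration.
type digest struct {
	sum   float64
	count uint64
}

// AddWeighted registers value as if it had been seen weight times.
func (d *digest) AddWeighted(value float64, weight uint32) {
	d.sum += value * float64(weight)
	d.count += uint64(weight)
}

// Add is an alias for AddWeighted with a weight of 1.
func (d *digest) Add(value float64) {
	d.AddWeighted(value, 1)
}

func main() {
	d := &digest{}
	d.Add(2)            // one sample with value 2
	d.AddWeighted(4, 3) // three samples with value 4
	fmt.Println(d.count, d.sum/float64(d.count))
}
```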
-
Make New() return an error instead of panic()ing 💬 by Caio 8 years ago
This patch now makes New() return a (*TDigest,error) tuple, which makes deserialization safe without having to trap for panic()s. The only remaining panic() is for bad input in a public function (`Quantile(float64)`). I'm keen on keeping it.
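A minimal sketch of a constructor that reports bad configuration through an error instead of panicking, combined with the self referential option functions this log introduces for `New()`. All names here (`digest`, `option`, `compression`, `newDigest`) and the default value are hypothetical; this is not the library's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// digest is a toy stand-in holding a single setting. Hypothetical type.
type digest struct {
	compression float64
}

// option mutates a digest during construction and may reject bad input.
type option func(*digest) error

// compression returns an option that validates and sets the compression.
func compression(c float64) option {
	return func(d *digest) error {
		if c <= 0 {
			return errors.New("compression must be positive")
		}
		d.compression = c
		return nil
	}
}

// newDigest applies options in order and returns the first error it sees,
// so callers never need to recover from a panic.
func newDigest(opts ...option) (*digest, error) {
	d := &digest{compression: 100} // assumed default for this sketch
	for _, opt := range opts {
		if err := opt(d); err != nil {
			return nil, err
		}
	}
	return d, nil
}

func main() {
	d, err := newDigest(compression(200))
	fmt.Println(d.compression, err)

	_, err = newDigest(compression(-1))
	fmt.Println(err)
}
```

Deserialization code can then propagate the error from the constructor rather than wrapping the call in `recover()`.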
-
Introduce a parameter-less New() 💬 by Caio 8 years ago
Now `tdigest.New()` gives a sane ready-to-use-in-most-cases digest. Configuration should be done via self referential functions. Ex:

    // create a digest with compression of 200
    tdigest.New(tdigest.Compression(200))

Notice that New() can still panic, which means that deserialization is still more dangerous than it should be.
-
Get rid of the x/w aliases to value/count by Caio 8 years ago
-
Get rid of TDigest._{count,mean} aliases 💬 by Caio 8 years ago
This patch gets rid of the _count() and _mean() helpers and moves them into the summary package. What I really wanted were macros, but hey ¯\_(ツ)_/¯

Summary is still leaky, but at least TDigest only uses its public interfaces now.
-
Rename `{}summary.keys` to `means` 💬 by Caio 8 years ago
$ gorename -from '"github.com/caio/go-tdigest".summary.keys' -to means
-
Completely rework the quantile estimation codepath 💬 by Caio 8 years ago
This patch is too big, but there isn't much getting away from it in smaller steps because summary{} and TDigest{} are actually tightly coupled (i.e.: the abstraction is mostly useful for code organization but fails at isolation).

The major changes are:

- Summaries now hold repeated items instead of just unique means and their respective counts (which led to changes in how the digest adds new centroids too)
- Quantile estimation is now a straight port from the reference implementation (issue-84 branch)

The digest no longer potentially reports completely wrong values on distributions with multiple steep hills, nor is it biased in favour of big centroids with few occurrences.

This patch closes #12. Some historical details for the motivation for this work can be found on PR #11.
-
Make TestMerge more thorough 💬 by Caio 8 years ago
This patch makes the test use multiple partitioning configurations and a lot more items (so that we can partition and merge more). This in turn means that the whole test is significantly slower (still very tolerable though) but helps uncover subtle drifts where precision is lower.
-
Output more details when TestRespectBounds fails by Caio 8 years ago
-
Add TestSingletonInACrowd 💬 by Caio 8 years ago
Notice that this test has its extreme quantile (0.999) skipped because the java reference implementation behaves the same. Reasoning and more details on issue #12.
-
Drop TestNonUniformDistribution 💬 by Caio 8 years ago
I know it's weird to drop failing tests, but bear with me: this test was added on 011e706e and has worked nicely since then, however I'm having a hard time accepting the error thresholds for this manually crafted distribution. Given that it fails with the java implementation as well (AVL and Merging -based versions), I'm considering it bugged.

I'll revisit this one day, but it looks to me like it would be more productive to actually test distributions and their relative errors in a standard manner as in TDigestTest.java

    --- FAIL: TestNonUniformDistribution (0.00s)
        tdigest_test.go:85: T-Digest.Quantile(0.2500) = 420.1363 vs actual 499.9531. Diff (79.8168) >= 11.0000
-
Make travis CI test with go 1.9, drop 1.6 by Caio 8 years ago
-
Merge pull request #11 from christineyen/master 💬 by Caio 8 years ago
Ensure returned values stay within bounds
-
Ensure returned values stay within bounds by Christine Yen 8 years ago