1 - Array Operations

Provides support functions enabling fast array operations

PlaidCloud expressions and filters provide access to most non-administrative Apache MADlib methods. To call a MADlib method, prefix its standard name with "func.madlib." as in the examples below.

In SQL

madlib.array_add(array1,array2);

In PlaidCloud Expressions & Filters

func.madlib.array_add(array1,array2)
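To make the semantics concrete, here is a minimal plain-Python sketch of what an element-wise array addition such as array_add computes. It is only an illustration, not MADlib's implementation; the helper name mirrors the MADlib method for readability, and the length check is an assumption of this sketch.

```python
def array_add(array1, array2):
    """Return the element-wise sum of two equal-length numeric sequences."""
    if len(array1) != len(array2):
        # Assumption of this sketch: mismatched lengths are treated as an error.
        raise ValueError("arrays must have the same length")
    return [x + y for x, y in zip(array1, array2)]


print(array_add([1, 2, 3], [10, 20, 30]))  # [11, 22, 33]
```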

External References

The official Apache MADlib documentation covers these methods in full, including additional capabilities and usage examples.

2 - Encoding Categorical Variables

Encodes categorical variables using one-hot, dummy, effects, orthogonal, or Helmert coding

PlaidCloud expressions and filters provide access to most non-administrative Apache MADlib methods. To call a MADlib method, prefix its standard name with "func.madlib." as in the examples below.

In SQL

madlib.encode_categorical_variables('abalone', 'abalone_out', 'height::TEXT');

In PlaidCloud Expressions & Filters

func.madlib.encode_categorical_variables('abalone', 'abalone_out', 'height::TEXT')
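As a rough illustration of one-hot coding, the plain-Python sketch below expands one column into a 0/1 indicator column per distinct level. It is not MADlib's implementation: the helper one_hot, the output column naming (column name, underscore, level), and the dropping of the original column are assumptions of this sketch. Values are converted to text first, mirroring the height::TEXT cast in the example above, which makes a numeric column behave as a categorical one.

```python
def one_hot(rows, column):
    """Replace `column` in each row dict with one 0/1 indicator per level."""
    # Treat every value as text so numeric columns can be encoded too.
    levels = sorted({str(row[column]) for row in rows})
    encoded = []
    for row in rows:
        out = {k: v for k, v in row.items() if k != column}
        for level in levels:
            # Hypothetical naming scheme: <column>_<level>.
            out[f"{column}_{level}"] = 1 if str(row[column]) == level else 0
        encoded.append(out)
    return encoded


rows = [{"id": 1, "sex": "M"}, {"id": 2, "sex": "F"}]
print(one_hot(rows, "sex"))
```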

External References

The official Apache MADlib documentation covers these methods in full, including additional capabilities and usage examples.

3 - Low-Rank Matrix Factorization

Represent an incomplete matrix using a low-rank approximation

PlaidCloud expressions and filters provide access to most non-administrative Apache MADlib methods. To call a MADlib method, prefix its standard name with "func.madlib." as in the examples below.

In SQL

madlib.lmf_igd_run('lmf_model', 'lmf_data', 'row', 'col', 'val', 999, 10000, 3, 0.1, 2, 10, 1e-9);

In PlaidCloud Expressions & Filters

func.madlib.lmf_igd_run('lmf_model', 'lmf_data', 'row', 'col', 'val', 999, 10000, 3, 0.1, 2, 10, 1e-9)

External References

The official Apache MADlib documentation covers these methods in full, including additional capabilities and usage examples.

4 - Matrix Operations

Provides basic matrix operations for matrices that are too big to fit in memory

PlaidCloud expressions and filters provide access to most non-administrative Apache MADlib methods. To call a MADlib method, prefix its standard name with "func.madlib." as in the examples below.

In SQL

madlib.matrix_trans('"mat_B"', 'row=row_id, val=vector', 'mat_r');

In PlaidCloud Expressions & Filters

func.madlib.matrix_trans('"mat_B"', 'row=row_id, val=vector', 'mat_r')
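The matrix_trans call above transposes a matrix stored as one vector per row. The plain-Python sketch below shows the same operation on a small in-memory matrix. It is illustrative only: MADlib operates on database tables precisely because the matrices may not fit in memory, and the helper name transpose is an assumption of this sketch.

```python
def transpose(rows):
    """Transpose a dense matrix given as a list of equal-length row lists."""
    return [list(column) for column in zip(*rows)]


mat_b = [
    [1, 2, 3],
    [4, 5, 6],
]
print(transpose(mat_b))  # [[1, 4], [2, 5], [3, 6]]
```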

External References

The official Apache MADlib documentation covers these methods in full, including additional capabilities and usage examples.

5 - Norms and Distance Functions

Useful utility functions for basic linear algebra operations

PlaidCloud expressions and filters provide access to most non-administrative Apache MADlib methods. To call a MADlib method, prefix its standard name with "func.madlib." as in the examples below.

In SQL

madlib.squared_dist_norm2(a, b);

In PlaidCloud Expressions & Filters

func.madlib.squared_dist_norm2(a, b)
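For reference, the squared Euclidean (2-norm) distance between two vectors is the sum of the squared differences of their components. The plain-Python sketch below computes that quantity. It illustrates the formula only and is not MADlib's implementation; the length check is an assumption of this sketch.

```python
def squared_dist_norm2(a, b):
    """Return the squared Euclidean distance between vectors a and b."""
    if len(a) != len(b):
        # Assumption of this sketch: mismatched lengths are treated as an error.
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


print(squared_dist_norm2([1, 2, 3], [4, 6, 3]))  # 25
```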

External References

The official Apache MADlib documentation covers these methods in full, including additional capabilities and usage examples.

6 - Path

Performs regular pattern matching over a sequence of rows

PlaidCloud expressions and filters provide access to most non-administrative Apache MADlib methods. To call a MADlib method, prefix its standard name with "func.madlib." as in the examples below.

In SQL

madlib.path('eventlog', 'path_output', 'session_id', 'event_timestamp ASC', 'buy:=page=''CHECKOUT''', '(buy)', 'sum(revenue) as checkout_rev', TRUE);

In PlaidCloud Expressions & Filters

func.madlib.path('eventlog', 'path_output', 'session_id', 'event_timestamp ASC', "buy:=page='CHECKOUT'", '(buy)', 'sum(revenue) as checkout_rev', True)

External References

The official Apache MADlib documentation covers this method in full, including additional capabilities and usage examples.

7 - Pivot

Performs basic OLAP-type operations on data

PlaidCloud expressions and filters provide access to most non-administrative Apache MADlib methods. To call a MADlib method, prefix its standard name with "func.madlib." as in the examples below.

In SQL

madlib.pivot('pivset_ext', 'pivout', 'id', 'piv', 'val', 'sum');

In PlaidCloud Expressions & Filters

func.madlib.pivot('pivset_ext', 'pivout', 'id', 'piv', 'val', 'sum')
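The pivot call above groups rows by id, spreads the distinct values of piv into columns, and fills each cell with the sum of val. The plain-Python sketch below mimics that reshaping on a list of row dicts. It is an illustration under stated assumptions, not MADlib's implementation: the helper pivot_sum, the output column naming, and the use of None for empty cells are all choices made for this sketch.

```python
def pivot_sum(rows, index, pivot_col, value_col):
    """Pivot row dicts: one output row per index value, one column per pivot value."""
    pivot_values = sorted({r[pivot_col] for r in rows})
    totals = {}
    for r in rows:
        key = (r[index], r[pivot_col])
        totals[key] = totals.get(key, 0) + r[value_col]

    result = []
    for idx in sorted({r[index] for r in rows}):
        out = {index: idx}
        for p in pivot_values:
            # Hypothetical naming scheme; cells with no source rows become None.
            out[f"{value_col}_sum_{pivot_col}_{p}"] = totals.get((idx, p))
        result.append(out)
    return result


data = [
    {"id": 0, "piv": 10, "val": 1},
    {"id": 0, "piv": 10, "val": 2},
    {"id": 0, "piv": 20, "val": 3},
    {"id": 1, "piv": 20, "val": 4},
]
for row in pivot_sum(data, "id", "piv", "val"):
    print(row)
```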

External References

The official Apache MADlib documentation covers this method in full, including additional capabilities and usage examples.

8 - Sessionize

Performs time-oriented session reconstruction on a data set comprising a sequence of events

PlaidCloud expressions and filters provide access to most non-administrative Apache MADlib methods. To call a MADlib method, prefix its standard name with "func.madlib." as in the examples below.

In SQL

madlib.sessionize('eventlog', 'sessionize_output_view', 'user_id', 'event_timestamp', '0:30:0');

In PlaidCloud Expressions & Filters

func.madlib.sessionize('eventlog', 'sessionize_output_view', 'user_id', 'event_timestamp', '0:30:0')
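In the call above, events are partitioned by user_id, ordered by event_timestamp, and split into sessions using a maximum gap of '0:30:0' (30 minutes). The plain-Python sketch below shows the general idea: a new session starts for a user whenever the gap since that user's previous event exceeds the threshold. It is a simplified illustration, not MADlib's implementation; the helper name, the tuple layout, and session numbering from 1 per user are assumptions of this sketch.

```python
from datetime import datetime, timedelta


def sessionize(events, max_gap=timedelta(minutes=30)):
    """Assign a per-user session number to (user_id, timestamp) events."""
    last_seen = {}   # user_id -> timestamp of that user's previous event
    session = {}     # user_id -> current session number
    result = []
    for user, ts in sorted(events):  # order by user, then by time
        if user not in last_seen:
            session[user] = 1
        elif ts - last_seen[user] > max_gap:
            session[user] += 1
        last_seen[user] = ts
        result.append((user, ts, session[user]))
    return result


log = [
    ("a", datetime(2024, 1, 1, 10, 0)),
    ("a", datetime(2024, 1, 1, 10, 10)),
    ("a", datetime(2024, 1, 1, 11, 0)),   # 50 minutes later: new session
    ("b", datetime(2024, 1, 1, 10, 5)),
]
for user, ts, session_id in sessionize(log):
    print(user, ts.isoformat(), session_id)
```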

External References

The official Apache MADlib documentation covers this method in full, including additional capabilities and usage examples.

9 - Singular Value Decomposition

Factorization of a real or complex matrix, with many useful applications in signal processing and statistics

PlaidCloud expressions and filters provide access to most non-administrative Apache MADlib methods. To call a MADlib method, prefix its standard name with "func.madlib." as in the examples below.

In SQL

madlib.matrix_sparsify('mat', 'row=row_id, val=row_vec', 'mat_sparse', 'row=row_id, col=col_id, val=value');

In PlaidCloud Expressions & Filters

func.madlib.matrix_sparsify('mat', 'row=row_id, val=row_vec', 'mat_sparse', 'row=row_id, col=col_id, val=value')

External References

The official Apache MADlib documentation covers these methods in full, including additional capabilities and usage examples.

10 - Sparse Vectors

Provides compressed storage of vectors that have many duplicate elements

PlaidCloud expressions and filters provide access to most non-administrative Apache MADlib methods. To call a MADlib method, prefix its standard name with "func.madlib." as in the examples below.

In SQL

madlib.gen_doc_svecs('svec_output', 'dictionary_table', 'id', 'term', 'documents_table', 'id', 'term', 'count');

In PlaidCloud Expressions & Filters

func.madlib.gen_doc_svecs('svec_output', 'dictionary_table', 'id', 'term', 'documents_table', 'id', 'term', 'count')
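Compressed storage for vectors with many repeated elements is commonly built on run-length encoding, where each run of identical values is stored once together with its length. The plain-Python sketch below shows that idea with a round trip. It illustrates the compression principle only, not MADlib's sparse vector type or its storage format; the helper names are assumptions of this sketch.

```python
from itertools import groupby


def rle_encode(values):
    """Compress a sequence into (run_length, value) pairs."""
    return [(len(list(group)), value) for value, group in groupby(values)]


def rle_decode(pairs):
    """Expand (run_length, value) pairs back into the full sequence."""
    return [value for count, value in pairs for _ in range(count)]


dense = [0, 0, 0, 5, 5, 1]
encoded = rle_encode(dense)
print(encoded)               # [(3, 0), (2, 5), (1, 1)]
print(rle_decode(encoded))   # [0, 0, 0, 5, 5, 1]
```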

External References

The official Apache MADlib documentation covers these methods in full, including additional capabilities and usage examples.

11 - Stemming

Provides a basic stemming operation for text input using the Porter Stemming Algorithm

PlaidCloud expressions and filters provide access to most non-administrative Apache MADlib methods. To call a MADlib method, prefix its standard name with "func.madlib." as in the examples below.

In SQL

madlib.stem_token(word);

In PlaidCloud Expressions & Filters

func.madlib.stem_token(word)

External References

The official Apache MADlib documentation covers this method in full, including additional capabilities and usage examples.