Feature Extraction
Feature Community Detection
- Randomly detecting communities in graphs for feature extraction
- Triggers based on user specified queries?
```mermaid flowchart LR A[Load Graph] --> B[Community Detection] --> C[Extract Features] A[Load Graph] <--> D[Reorder Nodes] D[Reorder Nodes] <--> C[Extract Features]
```
Community Feature Extraction
- Input a comma separated flat file, load into data frame
- Extract all the headers of that file
- Create permutations for all combinations of headers in the file
- Store the permutations in an array of sets
- Each permutation are headers used to create a temporary graph
- For each graph created from a permutation
- Run the community detection algorithm on the graph
- Count the amount of communities
- Get the average size of the communities
- Select the headers that create a graph with the largest communities
- Return a list of headers that the user should use for data investigation
Correlation Matrix
- Input comma separated flat file, load into data frame
- Extract the headers of the of the data frame
- Calculate correlation coefficient