File Organization Patterns Database

Comprehensive Reference for Automated Directory Management

Pattern Overview

This database collects file organization patterns, classification rules, and best practices for automated directory management across project types and organizational contexts.

Project Type Classification Patterns

1. Python Machine Learning Projects

python_ml_patterns:
  identification_indicators:
    file_extensions: [".py", ".ipynb", ".pkl", ".h5", ".csv", ".json"]
    import_patterns:
      - "import pandas"
      - "import numpy"
      - "import sklearn"
      - "import tensorflow"
      - "import torch"
      - "import matplotlib"
    directory_hints: ["data", "models", "notebooks", "experiments"]
    
  classification_rules:
    data_files:
      patterns: ["*.csv", "*.json", "*.parquet", "*.h5", "*.pkl", "*.npy"]
      content_indicators: ["dataset", "training", "test", "validation", "features", "labels"]
      size_thresholds:
        raw_data: "> 1MB -> src/data/raw/"
        processed_data: "> 10MB -> src/data/processed/"
        model_artifacts: "> 50MB -> src/models/"
      destination_logic: "size_and_content_based"
      
    notebook_files:
      patterns: ["*.ipynb"]
      content_analysis:
        exploratory: "contains 'explore', 'EDA', 'analysis' -> src/notebooks/exploratory/"
        modeling: "contains 'model', 'train', 'fit' -> src/notebooks/modeling/"
        visualization: "contains 'plot', 'chart', 'graph' -> src/notebooks/visualization/"
        preprocessing: "contains 'clean', 'preprocess', 'transform' -> src/notebooks/preprocessing/"
      cell_count_threshold: "> 5 cells indicates active notebook"
      
    script_files:
      patterns: ["*.py"]
      ast_analysis:
        main_scripts: "contains 'if __name__ == \"__main__\"' -> src/scripts/"
        utility_functions: "only function definitions -> src/utils/"
        model_classes: "contains 'class.*Model' -> src/models/"
        test_files: "contains 'def test_' or 'import unittest' -> src/tests/"
        configuration: "contains config/settings variables -> config/"
      import_analysis:
        data_processing: "imports pandas/numpy heavily -> src/data/"
        model_training: "imports sklearn/tensorflow -> src/models/"
        visualization: "imports matplotlib/seaborn -> src/visualization/"

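The data_files.size_thresholds and notebook_files.content_analysis rules can be sketched in the same spirit. Several points here are assumptions rather than anything the rules state: the size tiers overlap (a 60 MB file exceeds all three thresholds), so this sketch lets the largest exceeded threshold win; 1 MB is taken as 1024 * 1024 bytes; notebook keywords are matched case-insensitively with the first matching rule winning; and the names route_data_file and route_notebook are hypothetical.

```python
from fnmatch import fnmatch
from typing import Optional

MB = 1024 * 1024  # assumption: binary megabytes

DATA_PATTERNS = ["*.csv", "*.json", "*.parquet", "*.h5", "*.pkl", "*.npy"]

# Tiers from size_thresholds, checked largest first (assumed precedence).
SIZE_TIERS = [
    (50 * MB, "src/models/"),
    (10 * MB, "src/data/processed/"),
    (1 * MB, "src/data/raw/"),
]

# Keyword groups from content_analysis; first match wins (assumed precedence).
NOTEBOOK_RULES = [
    (("explore", "eda", "analysis"), "src/notebooks/exploratory/"),
    (("model", "train", "fit"), "src/notebooks/modeling/"),
    (("plot", "chart", "graph"), "src/notebooks/visualization/"),
    (("clean", "preprocess", "transform"), "src/notebooks/preprocessing/"),
]


def route_data_file(filename: str, size_bytes: int) -> Optional[str]:
    """Return a destination for a data file, or None if no tier applies."""
    if not any(fnmatch(filename.lower(), pattern) for pattern in DATA_PATTERNS):
        return None
    for threshold, destination in SIZE_TIERS:
        if size_bytes > threshold:
            return destination
    return None


def route_notebook(text: str) -> Optional[str]:
    """Return a destination for a notebook given its concatenated cell text."""
    lowered = text.lower()
    for keywords, destination in NOTEBOOK_RULES:
        if any(keyword in lowered for keyword in keywords):
            return destination
    return None
```

For example, route_data_file("weights.h5", 60 * MB) returns "src/models/" under the largest-threshold-wins assumption. Plain substring matching is deliberately naive: "fit" also matches "profit", so a real classifier would probably match on word boundaries.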
2. Web Application Projects

3. Research and Documentation Projects

Content-Based Classification Patterns

Natural Language Processing Indicators

Code Analysis Patterns

File Naming Convention Patterns

Standard Naming Patterns

Size-Based Classification Patterns

File Size Thresholds

Temporal Classification Patterns

Time-Based Organization

Integration Patterns

Squad Coordination Patterns

Success Metrics and Validation

Pattern Effectiveness Metrics

This pattern database is the knowledge base the squad draws on when deciding where files belong, how to optimize directory structure, and how to maintain it over time.

Last updated