Tuesday, January 15, 2013

.NET File System and Solution Organization

The file system is the lowest level of organization possible for source code. It’s like the dirt below the foundation of a structure: it may not always be at the forefront of an engineer’s mind, but if it is not stable, the foundation will crack as the ground shifts beneath it. Implementing a standard and a strategy for organizing files affects the stability and strength of the software built on top.

The Common State

Figure 1. Example root of Main.
Briefly, let’s look at an example source code repository root; call it Main. Figure 1 shows the example root of Main. Unless you happen to be interested in UML, it isn't clear where to start, and finding anything else requires digging.

Solutions

Visual Studio solutions play at least 3 important roles:

  • A view into, and an organization of, code and projects
  • A direct view of referential integrity between projects
  • A buildable collection of projects (that can be automatically built by TeamCity or other CI tools)

In the example repository, solutions in source control are ad hoc. Their location within the directory structure and the projects they combine follow no clear pattern, which makes the latter two of the three roles above very difficult to fulfill. To take advantage of all three roles, the file system layout and the creation of solutions must follow an enforced standard. Additionally, the file system layout should drive the composition of the solutions and reflect the architectural patterns inherent in the software itself.

Since solutions play such a big part in organizing the software, new solutions should be well thought out and should have a purpose beyond simply providing a place to view and edit code. When too many solutions exist, they need to be redefined and reduced. Before discussing how to do that, let’s go one level lower and look at how projects and code files are organized.

Projects

At a minimum, Visual Studio projects serve the purposes of:

  • Defining an assembly name and namespace
  • Organizing files (classes, resources, etc.)
  • Defining required references
Figure 2. Variances in project naming.

The primary purpose of a project is to define an assembly name and namespace. In many legacy repositories, projects are not created equal and follow no standard pattern or convention for naming. Figure 2 shows a few of the variances that exist in project naming in the example repository.

A project can be named differently from its assembly, since nothing enforces the naming. If the project name is the same as the assembly name, it is easy to understand where code physically lives in the file system. The assembly name would typically also be the root namespace of the code within. This is best practice and is the default when creating new projects but, again, it is not enforced. As seen in Figure 3, the project properties define the assembly name and the base namespace. Consistency across these helps keep things organized and clear.

Figure 3. Project properties define the assembly and base namespace.
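
Since nothing in Visual Studio enforces this consistency, a small script can at least report where it has drifted. The following is a minimal sketch, not part of the original tooling. It assumes classic (non-SDK-style) .csproj files that declare AssemblyName and RootNamespace elements in the standard MSBuild XML namespace. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# XML namespace used by classic (pre-SDK-style) MSBuild project files.
MSBUILD_NS = "{http://schemas.microsoft.com/developer/msbuild/2003}"


def read_project_names(proj_path):
    """Return (project_name, assembly_name, root_namespace) for one project file."""
    root = ET.parse(proj_path).getroot()

    def first_text(tag):
        # Try the element with and without the MSBuild XML namespace.
        for candidate in (MSBUILD_NS + tag, tag):
            node = root.find(".//" + candidate)
            if node is not None and node.text:
                return node.text.strip()
        return None

    # The project name is taken from the file name (an assumption of this sketch).
    project_name = Path(proj_path).stem
    return project_name, first_text("AssemblyName"), first_text("RootNamespace")


def find_naming_mismatches(source_root):
    """List projects whose file name, assembly name and root namespace differ."""
    mismatches = []
    for proj in sorted(Path(source_root).rglob("*.csproj")):
        name, assembly, namespace = read_project_names(proj)
        if not (name == assembly == namespace):
            mismatches.append((str(proj), name, assembly, namespace))
    return mismatches
```

Calling find_naming_mismatches on a working copy returns one tuple per inconsistent project, which makes a reasonable starting list for a cleanup effort. Projects that set these values only inside conditional property groups would need extra handling.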

Every project also has a file called AssemblyInfo.cs that describes the assembly once built. As seen in Figure 4, its values should match those of the project name and properties. Although it is beyond the scope of this discussion, the proper company name and copyright information should also be maintained in this file.

Figure 4. AssemblyInfo.cs should be consistent with the project name and properties.
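
The same kind of check can be applied to AssemblyInfo.cs. The sketch below makes several assumptions: the file lives in the default Properties folder of a C# project, the project folder is named after the project, and AssemblyTitle and AssemblyProduct are the attributes that should carry the project name. Which attributes matter is a team decision, so treat that list, and the hypothetical function names, as placeholders.

```python
import re
from pathlib import Path

# Matches lines such as: [assembly: AssemblyTitle("Acme.Core")]
# Commented-out attributes are skipped because the line must start with "[".
ATTRIBUTE_PATTERN = re.compile(
    r'^\s*\[assembly:\s*(AssemblyTitle|AssemblyProduct)\s*\(\s*"([^"]*)"\s*\)\s*\]',
    re.MULTILINE,
)


def read_assembly_attributes(assembly_info_text):
    """Return a dict of the attributes of interest found in AssemblyInfo.cs text."""
    return {name: value for name, value in ATTRIBUTE_PATTERN.findall(assembly_info_text)}


def check_assembly_info(project_dir):
    """Return a list of problems for one project folder (empty when consistent)."""
    project_dir = Path(project_dir)
    expected = project_dir.name  # assumption: the folder is named after the project
    info_file = project_dir / "Properties" / "AssemblyInfo.cs"
    if not info_file.is_file():
        return ["missing Properties/AssemblyInfo.cs"]

    # utf-8-sig tolerates the byte order mark Visual Studio often writes.
    attributes = read_assembly_attributes(info_file.read_text(encoding="utf-8-sig"))
    problems = []
    for attribute in ("AssemblyTitle", "AssemblyProduct"):
        actual = attributes.get(attribute)
        if actual != expected:
            problems.append(f"{attribute} is {actual!r}, expected {expected!r}")
    return problems
```

An empty list means the file agrees with the project name under these assumptions.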

Project References vs. Assembly References

Project references have obvious advantages in .NET: they enable rapid refactoring, dependency searching, and navigation. Ideally, a single solution containing all the projects in a given source repository provides the best overall .NET development experience. In some cases this cannot happen simply because of the scale of the source code. In other cases it is not the size of the source code itself but the sheer number of projects that makes a single solution infeasible. In these situations separate solutions are required, and the pattern for using project versus assembly references falls out of the organization of the solutions themselves.
Figure 5. Project references vs. internally built vs. 3rd party.

An important rule to remember: a reference to code that belongs to a project in the same solution should always be a project reference. This is shown by the highlighted projects in Figure 5. The blue underlined reference is an assembly reference to an internally built project whose output is in a common build folder. The black underlined reference is a 3rd party binary reference that is in the SharedBinaries folder.
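
The rule can be checked mechanically. As a rough sketch, assuming classic MSBuild project files in which assembly references appear as Reference items and project references as ProjectReference items, the following lists the assembly references that point at something built inside the same solution. The function names are hypothetical, and the caller has to supply the set of assembly names the solution builds.

```python
import xml.etree.ElementTree as ET

# XML namespace used by classic (pre-SDK-style) MSBuild project files.
MSBUILD_NS = "{http://schemas.microsoft.com/developer/msbuild/2003}"


def assembly_references(proj_path):
    """Return the simple names of all <Reference> items in a project file."""
    root = ET.parse(proj_path).getroot()
    names = []
    for node in root.iter(MSBUILD_NS + "Reference"):
        include = node.get("Include", "")
        # Include is either "Name" or "Name, Version=..., Culture=...".
        simple_name = include.split(",")[0].strip()
        if simple_name:
            names.append(simple_name)
    return names


def misplaced_assembly_references(proj_path, solution_assembly_names):
    """Return assembly references that should have been project references.

    solution_assembly_names: assembly names produced by projects that live in
    the same solution as proj_path (supplied by the caller in this sketch).
    """
    in_solution = set(solution_assembly_names)
    return [name for name in assembly_references(proj_path) if name in in_solution]
```

Anything this returns is a candidate for conversion to a project reference. References to framework assemblies, internally built assemblies from other solutions, and 3rd party binaries are left alone.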

Further Reading

The topics covered here can be further investigated on MSDN. Below are a few resources that provide suggestions on file system layouts. The layouts depend on the team, the software, the source control and more. The recommendations here take all of these influences into account, with the clear intention of keeping the repository clean and organized.


Recommendations

The recommendations come both from experience with many source code repositories and from MSDN, which examines the pros and cons of alternative styles. This is a template, and it is open to suggestion and customization to fit the needs and preferences of the team.

File System
Figure 6. Proposed example root of Main.

Simplifying and separating the root of the example Main might yield the folders shown in Figure 6. The following types of content should be separated from one another:

  • .NET source code
  • Non-.NET source code
  • Database code
  • Documentation, diagrams and examples

The cleanest separation in distributed source control (such as Mercurial or Git) is to create an individual repository for each of the above. For this example, TFS is the source repository. Creating separate TFS projects tends to make things more difficult because of branching and merging concerns. Folders at the root level can provide the same separation without the cross-project or cross-repository hassles that some source control systems handle poorly.
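
To keep the root from drifting back to its old state, the expected folders can be written down and checked. This is only a sketch: the folder names are the ones used in the sections that follow (Source, PHP, Databases, Documentation), the mapping to content types is an illustration, and the function names are hypothetical.

```python
from pathlib import Path

# Expected folders at the root of Main and the content type each one holds.
# The names follow the example in this article; adjust them for a real repository.
EXPECTED_ROOT_FOLDERS = {
    "Source": ".NET source code",
    "PHP": "non-.NET source code",
    "Databases": "database code",
    "Documentation": "documentation, diagrams and examples",
}


def unexpected_root_entries(main_root):
    """Return names at the root that are not expected (hidden entries are ignored)."""
    return sorted(
        entry.name
        for entry in Path(main_root).iterdir()
        if entry.name not in EXPECTED_ROOT_FOLDERS and not entry.name.startswith(".")
    )


def missing_root_folders(main_root):
    """Return the expected folders that do not exist at the root."""
    root = Path(main_root)
    return sorted(name for name in EXPECTED_ROOT_FOLDERS if not (root / name).is_dir())
```

Both functions return empty lists when the root matches the proposed layout exactly.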

Databases
Figure 7. Contents of /Main/Databases.

/Main/Databases would hold all the database source code, scripts, configuration, projects and solutions, as shown in Figure 7. Usually this can be organized as a single solution in which all the database projects are included and grouped by solution folders. Doing so provides a clean, manageable view into the database projects. See the Solutions section for further information on project organization within that folder.

Not addressed here are items like custom scripts and files. These can be included in the solution itself and organized within a folder in the Source directory, which lets a developer open the solution and have access to all the code that is part of the databases. Projects such as Analysis Services projects may be broken out into a separate solution if needed and placed in /Main/Databases; otherwise every developer may need to install specialized software just to open those project types.

Documentation

/Main/Documentation would hold all the documentation, organized so that it is clear where to find things. It may include individual projects that have code (such as examples), but mostly it will be diagrams and tutorials describing the software. This folder allows more flexibility in its organization since it is not source code and is not critical to the built product.

Non-.NET Languages

/Main/PHP contains only PHP code, no .NET code. It is only an example of what might exist, and it illustrates how non-.NET code can be separated into its own well-organized folders. These should be organized in a manner that supports the platform and IDE for that language.

Source
Figure 8. Proposed example /Main/Source.

/Main/Source would hold all the .NET code and everything necessary to build the .NET products. It would be divided into sub-folders, each holding the projects associated with a particular layer or major function. Figure 8 shows a proposed root of the example /Main/Source. Notice that the root of this directory has a few build scripts and some configuration files, but no solutions. The sub-folders contain the solution files for each of the major categories.

Figure 9. Generic sub-folder example.

Each sub-folder has the same structure, with the exception of SharedBinaries, which has its own organizational pattern. Figure 9 shows the generic pattern for the folder structure within a sub-folder. Within the Source and UnitTests folders are all the project folders, with the .*proj project files only a single level deep. Each project should have a proper namespace, and its folder name should reflect that namespace, as seen in Figure 10. In the UnitTests folder, each test project should be named identically to the source project it tests (the companion project), with the .Test suffix appended, as seen in Figure 11.

Figure 10. Project directories in Source.
Figure 11. Test Project directories in UnitTests.
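
These folder conventions are simple enough to verify with a script. The sketch below walks one sub-folder of /Main/Source and reports project files that are not exactly one level deep, test folders that lack the .Test suffix, and test projects without a companion in Source. It assumes the project folder name is the project name. The function names and message wording are hypothetical.

```python
from pathlib import Path

TEST_SUFFIX = ".Test"


def _project_folders(parent):
    """Return the project folders directly under Source or UnitTests."""
    if not parent.is_dir():
        return []
    return sorted(p for p in parent.iterdir() if p.is_dir())


def _depth_problems(folder):
    """Check that a project folder holds its .*proj file directly, not deeper."""
    problems = []
    if not any(folder.glob("*.*proj")):
        problems.append(f"{folder.name}: no project file directly inside the folder")
    for nested in sorted(folder.rglob("*.*proj")):
        if nested.parent != folder:
            problems.append(f"{folder.name}: project file nested too deep ({nested.name})")
    return problems


def check_subfolder(subfolder):
    """Return convention problems for one sub-folder of /Main/Source."""
    subfolder = Path(subfolder)
    problems = []

    source_names = set()
    for folder in _project_folders(subfolder / "Source"):
        source_names.add(folder.name)
        problems.extend(_depth_problems(folder))

    for folder in _project_folders(subfolder / "UnitTests"):
        problems.extend(_depth_problems(folder))
        if not folder.name.endswith(TEST_SUFFIX):
            problems.append(f"{folder.name}: test project name should end with {TEST_SUFFIX}")
        elif folder.name[: -len(TEST_SUFFIX)] not in source_names:
            problems.append(f"{folder.name}: no companion project in Source")
    return problems
```

Running check_subfolder over every sub-folder except SharedBinaries, which follows its own pattern, gives a quick report of where the layout has drifted.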

Each sub-folder in /Main/Source has a single solution. No other solutions should be created unless a new major subcategory is defined. With this distribution of projects into subcategories, there will usually be a reasonable number of projects in each category. Within the solution, the projects can be grouped using solution folders, which are described further in the Solutions section. All content in a sub-folder’s Source or UnitTests folder should be part of the solution. Any content that is not part of the solution should either be removed from source control or added to the solution. New projects will be added to existing solutions and placed into the Source or UnitTests folder based on the project type.
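
Projects that sit on disk but belong to no solution are easy to miss by eye. The sketch below compares the projects listed in a sub-folder's .sln file with the project files found one level deep under Source and UnitTests. It relies on the usual text layout of Project lines in a Visual Studio solution file. The regular expression is an approximation rather than a full parser, path comparison is case-sensitive, and the function names are hypothetical.

```python
import re
from pathlib import Path

# Matches lines such as:
# Project("{TYPE-GUID}") = "Acme.Core", "Source\Acme.Core\Acme.Core.csproj", "{PROJECT-GUID}"
PROJECT_LINE = re.compile(
    r'^Project\("\{[^}]+\}"\)\s*=\s*"([^"]+)",\s*"([^"]+)",\s*"\{[^}]+\}"',
    re.MULTILINE,
)


def projects_in_solution(sln_path):
    """Return the resolved paths of all project files listed in a .sln file."""
    sln_path = Path(sln_path)
    text = sln_path.read_text(encoding="utf-8-sig")
    found = set()
    for _name, rel_path in PROJECT_LINE.findall(text):
        # Solution folders also appear as Project lines; keep real project files only.
        if rel_path.lower().endswith("proj"):
            found.add((sln_path.parent / rel_path.replace("\\", "/")).resolve())
    return found


def orphaned_projects(subfolder):
    """Return project files under Source/UnitTests that no solution includes."""
    base = Path(subfolder).resolve()
    referenced = set()
    for sln in base.glob("*.sln"):
        referenced |= projects_in_solution(sln)

    on_disk = set()
    for name in ("Source", "UnitTests"):
        folder = base / name
        if folder.is_dir():
            on_disk |= {p.resolve() for p in folder.glob("*/*.*proj")}

    return sorted(p.relative_to(base).as_posix() for p in on_disk - referenced)
```

Each path returned is a candidate for either adding to the solution or removing from source control, as described above.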

Solutions
Figure 12.  A proposed solution.

The solutions should organize and separate the projects using solution folders that accurately reflect the namespaces. This is possible in most, but not all, cases; sometimes, for example, there will be a need for a Common folder that holds a mix of projects. Grouping projects by namespace enables quick navigation to the projects of interest.

Figure 12 shows a proposed solution. Notice how Services and Repositories are broken out so that the test projects are in their own Services.Test and Repositories.Test folders respectively. This simplifies the programmer’s view when coding by reducing the number of projects in any particular folder. Since tests are developed separately from the main source code, it also simplifies overall navigation. Notice that this is not the case in Web. When the number of projects is reasonable, it is visually helpful to leave each test project alongside its companion project.

Long Term Goal
Figure 13. A long term goal. Main.sln.

Although the source code in many repositories may be rather large, this is often because of its current organizational patterns rather than the actual lines of code and complexity of the application. After an initial effort to organize the file system and solutions, there will be an opportunity to simplify and reduce the number of projects, clean up namespaces and perform other organizational tasks that simplify the overall source code. In the long term, it may be possible to have a Main solution, as shown in Figure 13, that includes all the .NET projects (excluding databases). Obviously, an enormous number of projects in a single solution is not appropriate, so this depends on first reducing the project count. If an organization is able to achieve this goal, there are extraordinary benefits to programmers and the business in the ability to implement new features and maintain the source code.