I want to convert a CSV file to YAML with Python, but the output isn't right

Asked 2 years ago, Updated 2 years ago, 130 views

I want to convert a CSV file to YAML with Python, but the output is not what I expect.

CSV (keywords.csv):

version,3
services,
db,
container_name,atcoder-stream-db
image,postgres:12.1
expose,"- 5432"

web-backend,
container_name,atcoder-stream-backend
build,
context,./atcoder-stream-backend
dockerfile,Dockerfile
volumes,
,- ./atcoder-stream-backend/atcoder-stream-api
,- ./atcoder-stream-backend/libraries/lib/lib
,- ./atcoder-stream-backend/libraries/twitterapi/twitterapi
ports,"- ""8000:8000"""
depends_on,- db
command,"sh -c ""python /app/src/atcoder-stream-api/manage.py migrate && python"""
web-frontend,
container_name,atcoder-stream-frontend
build,
context,./atcoder-stream-frontend
dockerfil,Dockerfile
volumes,
,- ./atcoder-stream-frontend:/app
port,"- ""3000:3000"""
command,"sh -c ""cd /app && yarn start"""

Expected YAML file:

version: "3"

services:
  db:
    container_name: atcoder-stream-db
    image: postgres:12.1
    expose:
      - "5432"

  web-backend:
    container_name: atcoder-stream-backend
    build:
      context: ./atcoder-stream-backend
      dockerfile: Dockerfile
    volumes:
      - ./atcoder-stream-backend/atcoder-stream-api
      - ./atcoder-stream-backend/libraries/lib/lib
      - ./atcoder-stream-backend/libraries/twitterapi/twitterapi
    ports:
      # host:container
      - "8000:8000"
    depends_on:
      - db
    command: sh -c "python /app/src/atcoder-stream-api/manage.py migrate && python /app/src/atcoder-stream-api/manage.py runserver 0.0.0.0:8000"

  web-frontend:
    container_name: atcoder-stream-frontend
    build:
      context: ./atcoder-stream-frontend
      dockerfile: Dockerfile
    volumes:
      - ./atcoder-stream-frontend:/app
    ports:
      # host:container
      - "3000:3000"
    command: sh -c "cd /app && yarn start"

The Python code I wrote:

import pandas as pd
from pathlib import Path
import yaml

path = Path('keywords.csv')
df = pd.read_csv(path, encoding='cp932')
# print(df)
# path2 = Path('data2.csv')
# df.to_csv(path2, encoding='cp932', index=False)
# print(df)
df = pd.read_csv(path, encoding='cp932', index_col=0, header=None, na_filter=False)
with open('config.yml', 'w') as yaml_file:
    yaml.dump(
        df.to_dict(orient='dict'),
        yaml_file,
        sort_keys=False,
    )

Output (config.yml):

1:
  version: '3'
  services: ''
  db: ''
  container_name: atcoder-stream-frontend
  image: postgres:12.1
  expose: '- 5432'
  web-backend: ''
  build: ''
  context: ./atcoder-stream-frontend
  dockerfile: Dockerfile
  volumes: ''
  ? ''
  : '- ./atcoder-stream-frontend:/app'
  ports: '- "8000:8000"'
  depends_on: '- db'
  command: 'sh -c "cd /app && yarn start"'
  web-frontend: ''
  dockerfil: Dockerfile
  port: '- "3000:3000"'

  • '' appears for blank entries, and I want to remove it.
  • Both db and web-backend have container_name, but it is not recognized under web-backend.
  • The data under value is not output.

Is there a problem with how the CSV is written?

I can find information on converting YAML to CSV, but not much on how to go in this direction.

python csv yaml

2022-09-30 19:39

1 Answer

The most likely cause is that the CSV lists all of its items flat, at a single level.

  • '' appears for blank entries, and I want to remove it.
    → Probably because the CSV has rows whose value is blank; with na_filter=False those cells are read as empty strings, which are written out as ''. Try removing the blanks from the CSV.

  • Both db and web-backend have container_name, but it is not recognized under web-backend.
    → Because the items are written flat, entries with the same name are overwritten by the later value.

  • The data under value is not output.
    → I assume "value" is a typo for "volumes".
    You need to understand what conversion to_dict() of a pandas DataFrame performs, what DataFrame/dictionary structure is needed to produce the desired .yml, and how the CSV has to be laid out to get there.
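
The overwriting described above can be reproduced without pandas. The following is a minimal sketch that uses only the standard library and a trimmed copy of the question's CSV; it imitates a flat key-to-value mapping rather than calling to_dict() itself, and the variable names are chosen just for this example.

```python
import csv
import io

# Trimmed copy of the flat "key,value" CSV from the question.
flat_csv = """services,
db,
container_name,atcoder-stream-db
web-backend,
container_name,atcoder-stream-backend
"""

flat = {}
for key, value in csv.reader(io.StringIO(flat_csv)):
    # A repeated key silently replaces the earlier value,
    # and a blank cell arrives as an empty string.
    flat[key] = value

print(flat)
```

Only one container_name survives (the later one), and services, db and web-backend map to empty strings, which is the same pattern seen in the question's output.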

Given the question and the points above, it looks difficult to get the desired .yml from a plain to_dict() on a pandas DataFrame alone.

There are sites like the one below. When I pasted in the expected YAML file and converted it, conversion to CSV did not work, but conversion to JSON did.
Best Online YAML Converter
The result from that page is shown below; if you can build a dictionary equivalent to it, you can produce the .yml.

{
    "version": "3",
    "services": {
        "db": {
            "container_name": "atcoder-stream-db",
            "image": "postgres:12.1",
            "expose": [
                "5432"
            ]
        },
        "web-backend": {
            "container_name": "atcoder-stream-backend",
            "build": {
                "context": "./atcoder-stream-backend",
                "dockerfile": "Dockerfile"
            },
            "volumes": [
                "./atcoder-stream-backend/atcoder-stream-api",
                "./atcoder-stream-backend/libraries/lib/lib",
                "./atcoder-stream-backend/libraries/twitterapi/twitterapi"
            ],
            "ports": [
                "8000:8000"
            ],
            "depends_on": [
                "db"
            ],
            "command": "sh -c \"python /app/src/atcoder-stream-api/manage.py migrate && python /app/src/atcoder-stream-api/manage.py runserver 0.0.0.0:8000\""
        },
        "web-frontend": {
            "container_name": "atcoder-stream-frontend",
            "build": {
                "context": "./atcoder-stream-frontend",
                "dockerfile": "Dockerfile"
            },
            "volumes": [
                "./atcoder-stream-frontend:/app"
            ],
            "ports": [
                "3000:3000"
            ],
            "command": "sh -c \"cd /app && yarn start\""
        }
    }
}
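
To see what a CSV would have to encode, a dictionary like the one above can be walked into one row per leaf value. This is a minimal sketch on a trimmed copy of the dictionary; leaf_rows is a hypothetical helper written for this example, and marking list items with "- " is an assumed convention, not something the converter site produces.

```python
import json

# Trimmed copy of the target dictionary shown above.
target = json.loads("""
{
    "version": "3",
    "services": {
        "db": {
            "image": "postgres:12.1",
            "expose": ["5432"]
        }
    }
}
""")

def leaf_rows(node, path=()):
    """Yield one tuple per leaf: the key path followed by the value."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from leaf_rows(child, path + (key,))
    elif isinstance(node, list):
        for item in node:
            # Assumed convention: list items are marked with a leading "- ".
            yield path + ("- " + str(item),)
    else:
        yield path + (str(node),)

rows = list(leaf_rows(target))
for row in rows:
    print(",".join(row))
```

Each row has a different number of key columns before its value, which is the nesting information a flat key,value CSV loses.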

To produce nested data like this, with lists (arrays) part-way down, the CSV layout has to account for it as well; a plain to_dict() cannot turn only some of the entries into lists.
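
As one illustration of a CSV layout that does account for nesting, the sketch below uses one column per nesting level, treats blank leading cells as "same key as the row above", and treats values starting with "- " as list items. The layout and the rows_to_dict helper are assumptions made for this example; it uses only the standard library, expects every row to contain at least a key and a value, and ignores quoting.

```python
import csv
import io

# Hypothetical nested layout: key columns first, value in the last filled cell.
nested_csv = """version,3,,
services,db,image,postgres:12.1
,,expose,- 5432
,web-backend,volumes,- ./atcoder-stream-backend/libraries/lib/lib
,,,- ./atcoder-stream-backend/libraries/twitterapi/twitterapi
"""

def rows_to_dict(text):
    root = {}
    previous = []
    for row in csv.reader(io.StringIO(text)):
        cells = [cell.strip() for cell in row]
        # Drop trailing blank cells so the last cell is the value.
        while cells and cells[-1] == "":
            cells.pop()
        if not cells:
            continue
        # Forward-fill blank cells from the row above.
        cells = [cell or previous[i] for i, cell in enumerate(cells)]
        previous = cells
        *keys, value = cells
        # Walk (or create) the nested dictionaries for all but the last key.
        node = root
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        if value.startswith("- "):
            node.setdefault(keys[-1], []).append(value[2:])
        else:
            node[keys[-1]] = value
    return root

result = rows_to_dict(nested_csv)
print(result)
```

The same key name can now appear under different parents without being overwritten, because each value is stored under its full key path.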

Also, by default PyYAML uses single quotes for strings that need quoting, so getting double quotes in the output takes some extra work.
Many answers suggest that this is easier with ruamel.yaml, so I use it here.
Python-yaml script and double quotes
How to print a value with double quotes and spaces in YAML?
Python YAML dumper single quote and double quote issue
Pythonyaml generate new values in quotes

For example, you can lay out a table like this in Excel.
[Image: the table in Excel]

Here is the corresponding CSV.

version,"""3""",,,
services,db,,,
,,container_name,atcoder-stream-db,
,,image,postgres:12.1,
,,expose,"- ""5432""",
,web-backend,container_name,atcoder-stream-backend,
,,build,context,./atcoder-stream-backend
,,,dockerfile,Dockerfile
,,volumes,- ./atcoder-stream-backend/atcoder-stream-api,
,,,- ./atcoder-stream-backend/libraries/lib/lib,
,,,- ./atcoder-stream-backend/libraries/twitterapi/twitterapi,
,,ports,"- ""8000:8000""",
,,depends_on,- db,
,,command,"sh -c ""python /app/src/atcoder-stream-api/manage.py migrate && python /app/src/atcoder-stream-api/manage.py runserver 0.0.0.0:8000""",
,web-frontend,container_name,atcoder-stream-frontend,
,,build,context,./atcoder-stream-frontend
,,,dockerfil,Dockerfile
,,volumes,- ./atcoder-stream-frontend:/app,
,,port,"- ""3000:3000""",
,,command,"sh -c ""cd /app && yarn start""",

The Python script is below: it forward-fills the blanks in the CSV it reads, splits the rows with groupby, and builds a nested dictionary.
It processes the rows sequentially and is tailored to the .yml in the question, so it probably has gaps, but as a first attempt it looks like this.

import pandas as pd
from pathlib import Path
import ruamel.yaml
from ruamel.yaml.scalarstring import DoubleQuotedScalarString as dq

yaml = ruamel.yaml.YAML()
yaml.indent(sequence=4, offset=2)
yaml.width = 1024  # keep the long command strings on one line

path = Path('keywords.csv')
df = pd.read_csv(path, encoding='cp932', header=None, skipinitialspace=True, dtype='str')
# Forward-fill the key columns so every row carries its full key path,
# then turn the remaining NaN cells into empty strings.
for n in range(3):
    df[n] = df[n].ffill()
df = df.fillna('')

def df_to_dict(frame, currentindex, yamldict):
    # Group the rows by the current key column and fill yamldict recursively.
    nextindex = currentindex + 1
    for key, workframe in frame.groupby(by=currentindex, sort=False):
        value = workframe.at[workframe.index.values[0], nextindex].strip()
        if len(workframe) == 1:
            # Single row: a scalar, or a one-element list if it starts with "- "
            if value:
                uselist = False
                if value.startswith('-'):
                    value = value[2:]
                    uselist = True
                if value.startswith('"'):
                    # Keep the double quotes in the YAML output
                    value = dq(value.strip('"'))
                if uselist:
                    value = [value]
                yamldict[key] = value
        elif value.startswith('-'):
            # Several rows of "- item": collect them into a list
            values = [v.strip()[2:] for v in workframe[nextindex].tolist()]
            yamldict[key] = values
        else:
            # Several rows with sub-keys: drop the current column and recurse
            work2frame = workframe.iloc[:, workframe.columns != currentindex]
            workdict = {}
            df_to_dict(work2frame, nextindex, workdict)
            yamldict[key] = workdict

yamldict = {}
df_to_dict(df, 0, yamldict)

with open('config.yml', 'w') as yaml_file:
    yaml.dump(
        yamldict,
        yaml_file,
    )

The resulting file is as follows. The dockerfil and port keys under web-frontend are typos carried over from the CSV; correct them there to get dockerfile and ports.

version: "3"
services:
  db:
    container_name: atcoder-stream-db
    image: postgres:12.1
    expose:
      - "5432"
  web-backend:
    container_name: atcoder-stream-backend
    build:
      context: ./atcoder-stream-backend
      dockerfile: Dockerfile
    volumes:
      - ./atcoder-stream-backend/atcoder-stream-api
      - ./atcoder-stream-backend/libraries/lib/lib
      - ./atcoder-stream-backend/libraries/twitterapi/twitterapi
    ports:
      - "8000:8000"
    depends_on:
      - db
    command: sh -c "python /app/src/atcoder-stream-api/manage.py migrate && python /app/src/atcoder-stream-api/manage.py runserver 0.0.0.0:8000"
  web-frontend:
    container_name: atcoder-stream-frontend
    build:
      context: ./atcoder-stream-frontend
      dockerfil: Dockerfile
    volumes:
      - ./atcoder-stream-frontend:/app
    port:
      - "3000:3000"
    command: sh -c "cd /app && yarn start"


2022-09-30 19:39

© 2024 OneMinuteCode. All rights reserved.